CryoSPIN: Improving Ab-Initio Cryo-EM Reconstruction with Semi-Amortized Pose Inference

Shayan Shekarforoush, David B. Lindell, Marcus A. Brubaker, David J. Fleet

TL;DR

This work proposes a new semi-amortized method, cryoSPIN, in which reconstruction begins with amortized inference and then switches to a form of auto-decoding to refine poses locally using stochastic gradient descent, and shows that cryoSPIN outperforms the state-of-the-art cryoAI in speed and reconstruction quality.

Abstract

Cryo-EM is an increasingly popular method for determining the atomic-resolution 3D structure of macromolecular complexes (e.g., proteins) from noisy 2D images captured by an electron microscope. The computational task is to reconstruct the 3D density of the particle, along with the 3D pose of the particle in each 2D image, for which the posterior pose distribution is highly multi-modal. Recent developments in cryo-EM have focused on deep learning, in which amortized inference has been used to predict pose. Here, we address key problems with this approach, and propose a new semi-amortized method, cryoSPIN, in which reconstruction begins with amortized inference and then switches to a form of auto-decoding to refine poses locally using stochastic gradient descent. Through evaluation on synthetic datasets, we demonstrate that cryoSPIN is able to handle multi-modal pose distributions during the amortized inference stage, while the later, more flexible stage of direct pose optimization yields faster and more accurate convergence of poses compared to baselines. On experimental data, we show that cryoSPIN outperforms the state-of-the-art cryoAI in speed and reconstruction quality.
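
The two-stage idea in the abstract can be illustrated with a deliberately small toy problem. The sketch below is not the authors' implementation: it assumes a single in-plane angle instead of a full 3D rotation, a fixed hypothetical `render` function in place of the learned volume decoder, hand-picked candidate poses in place of encoder heads, and plain (non-stochastic) gradient descent. The names `render`, `loss`, `best_of_heads`, and `refine` are invented for illustration.

```python
import math

def render(theta):
    """Toy 'projection' for pose angle theta (hypothetical stand-in for a volume slice)."""
    return (math.cos(theta), math.sin(theta))

def loss(theta, image):
    """Squared reconstruction error between the rendered projection and the image."""
    px, py = render(theta)
    return (px - image[0]) ** 2 + (py - image[1]) ** 2

def best_of_heads(candidates, image):
    """Stage 1 (amortized): keep the candidate pose with the minimum error."""
    return min(candidates, key=lambda th: loss(th, image))

def refine(theta, image, lr=0.1, steps=200):
    """Stage 2 (auto-decoding): treat the pose as a free parameter and descend the loss."""
    for _ in range(steps):
        px, py = render(theta)
        # Analytic derivative of the squared error with respect to theta.
        grad = 2 * (px - image[0]) * (-math.sin(theta)) + 2 * (py - image[1]) * math.cos(theta)
        theta -= lr * grad
    return theta

# Usage: the image is generated at a pose unknown to the solver.
true_theta = 1.0
image = render(true_theta)
candidates = [0.2, 0.8, 2.5, -2.0]   # assumed outputs of M = 4 encoder heads
start = best_of_heads(candidates, image)
refined = refine(start, image)
```

Selecting the minimum-error candidate gives a starting point close to a mode, and the per-image gradient updates then remove the residual error that a shared encoder would leave behind.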

Paper Structure

This paper contains 22 sections, 11 equations, 8 figures, and 2 tables.

Figures (8)

  • Figure 1: CryoSPIN consists of two stages: (i) an auto-encoding stage where an image encoder equipped with multiple heads maps the input image to the pose candidate set $\{\phi_1, \dots, \phi_M\}$, followed by computing projections by slicing through the volume decoder in Fourier space based on the pose set. These projections are compared with the input image and the one with the minimum error is used. (ii) An auto-decoding stage where pose parameters in axis-angle representation are stored for all images. The same volume decoder is used to obtain projections, and the reconstruction loss is computed for a single projection.
  • Figure 2: Qualitative and quantitative comparison of reconstructions obtained by our proposed semi-amortized method, cryoSPIN, with cryoAI [levy2022cryoai], cryoDRGN [zhong2021cryodrgn2], and cryoSPARC [punjani2017cryosparc]. We note that cryoAI often becomes stuck in local minima when particles are not centered, so for cryoAI we translate the input images to correct for the spatial offset. (Left) Final 3D reconstructions on three synthetic datasets and one experimental dataset (EMPIAR-10028) are depicted using ChimeraX [goddard2018ucsf]. (Right) FSC curves are visualized for quantitative comparison. The red dashed lines show the standard threshold levels of $0.5$ and $0.143$ used to report the resolution (in Angstroms) for synthetic and real data, respectively. CryoSPIN achieves higher resolution on the Spliceosome and HSP datasets, and it is competitive with the state of the art on the Spike and EMPIAR-10028 datasets.
  • Figure 3: 3D resolution as a function of log time for different methods. These plots show the semi-amortized method is significantly faster than cryoAI.
  • Figure 4: Quantitative and qualitative comparison of fully- vs. semi-amortized methods in pose optimization for the Spike (top) and Spliceosome (bottom) datasets. (Left) Mean geodesic distance between the predicted pose and the ground truth is visualized at different epochs. By switching from amortized inference to direct optimization, the semi-amortized method (blue) enjoys accelerated pose convergence compared to amortized inference (red). (Right) To visualize pose inference, images depict the approximate log posterior for three particles (marginalized over in-plane rotations) as a view-direction distribution over a HEALPix [gorski2005healpix] uniform grid on the unit sphere $S^2$. After gnomonic projection to 2D, we show the neighborhood of the mode of interest. Black dots depict starting points. Blue and red dots are pose estimates from fully- and semi-amortized methods.
  • Figure 5: Comparing the performance of our multi-head pose encoder ($M=4$) with the cryoAI pose encoder on the challenging HSP [goodsell2008pdb] dataset. (Left) The approximate log posterior of view direction is visualized on the unit sphere, with highlighted areas showing modes of the distribution. CryoAI and multi-head encoders provide two and four pose estimates, respectively, which are marked with colored dots on the sphere (the order of poses is arbitrary). Below the sphere, the corresponding projections are illustrated. CryoAI fails to find the correct mode while our method is able to cover multiple modes. (Top, right) With our multi-head encoder, the reconstruction converges to a much higher resolution compared to cryoAI. (Bottom, right) The percentage of images assigned to each head is visualized as a bar plot, confirming that all heads participate in pose estimation.
  • ...and 3 more figures
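
The captions for Figures 1 and 4 refer to poses stored in axis-angle form and to a geodesic distance between predicted and ground-truth poses. As a minimal sketch of what these quantities usually mean, the code below converts an axis-angle vector to a rotation matrix with Rodrigues' formula and measures the geodesic distance as the angle of the relative rotation. It assumes the common convention that the vector's direction is the rotation axis and its norm is the angle in radians; the paper's exact conventions are not stated in this excerpt, and the function names are hypothetical.

```python
import math

def axis_angle_to_matrix(v):
    """Rodrigues' formula: axis-angle vector v (axis times angle, radians) -> 3x3 matrix."""
    angle = math.sqrt(sum(comp * comp for comp in v))
    if angle < 1e-12:
        # Zero rotation: return the identity.
        return [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
    x, y, z = (comp / angle for comp in v)
    c, s = math.cos(angle), math.sin(angle)
    t = 1.0 - c
    return [
        [c + x * x * t, x * y * t - z * s, x * z * t + y * s],
        [y * x * t + z * s, c + y * y * t, y * z * t - x * s],
        [z * x * t - y * s, z * y * t + x * s, c + z * z * t],
    ]

def geodesic_distance(Ra, Rb):
    """Rotation angle (radians) of the relative rotation Ra^T Rb."""
    # trace(Ra^T Rb) equals the element-wise (Frobenius) inner product of Ra and Rb.
    trace = sum(Ra[k][i] * Rb[k][i] for k in range(3) for i in range(3))
    # Clamp to guard against rounding slightly outside [-1, 1].
    cos_angle = max(-1.0, min(1.0, (trace - 1.0) / 2.0))
    return math.acos(cos_angle)

# Usage: two rotations about the z-axis, by 0.3 and 1.0 radians.
R_pred = axis_angle_to_matrix((0.0, 0.0, 0.3))
R_true = axis_angle_to_matrix((0.0, 0.0, 1.0))
error = geodesic_distance(R_pred, R_true)
```

Averaging this error over all images would give a mean geodesic distance of the kind plotted against epochs in Figure 4.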