Improved Cryo-EM Pose Estimation and 3D Classification through Latent-Space Disentanglement

Weijie Chen; Yuhang Wang; Lin Yao

TL;DR

This work tackles heterogeneous cryo-EM reconstruction under extreme noise and unknown poses by introducing HetACUMN, a self-supervised, amortized-inference variational autoencoder. It jointly learns variational image reconstruction and a conditional pose-prediction task to explicitly disentangle pose and conformation latent spaces, improving conformational classification and enabling efficient heterogeneous reconstructions. Demonstrations on simulated datasets show superior or competitive performance against state-of-the-art amortized methods and non-amortized baselines, with a notably more concentrated latent-z space that facilitates discriminating multiple conformations; experiments on real spliceosome data reveal plausible conformational dynamics consistent with prior literature. The approach offers a scalable pathway to accurate, fast heterogeneous cryo-EM analysis, though future work is needed to capture more complex conformational variability beyond the chosen latent-space representation.

Abstract

Reconstructing 3D volumes from 2D images in cryo-electron microscopy (cryo-EM) is very challenging because of the extremely low signal-to-noise ratio (SNR) and the unknown poses (projection angles and image shifts). Heterogeneous cryo-EM reconstruction additionally requires conformational classification. In popular cryo-EM reconstruction algorithms, poses and conformation classification labels must be predicted for every input cryo-EM image, which can be computationally costly for large datasets. An emerging class of methods adopts the amortized inference approach. In these methods, only a subset of the input dataset is needed to train neural networks for the estimation of poses and conformations. Once trained, these neural networks can make pose and conformation predictions and 3D reconstructions for the entire dataset at low cost during inference. Unfortunately, in heterogeneous reconstruction tasks, current amortized-inference-based methods struggle to estimate the conformational distribution and the poses from entangled latent variables. Here, we propose a self-supervised variational autoencoder architecture based on amortized inference, called "HetACUMN". We employ an auxiliary conditional pose prediction task, which inverts the order of the encoder and decoder, to explicitly enforce the disentanglement of conformation and pose predictions. Results on simulated datasets show that HetACUMN generates more accurate conformational classifications than other amortized or non-amortized methods. Furthermore, we show that HetACUMN can perform heterogeneous 3D reconstruction on a real experimental dataset.

Paper Structure (24 sections, 11 equations, 16 figures, 3 tables)

Introduction
Background
Related Work
Conformational classification
Pose estimation
Joint estimation of conformations and poses
Methods
Overview of HetACUMN
Variational image reconstruction task
Conditional pose prediction task
Noise Generation
Metric for measuring latent-space entanglement
Experiments
Reconstructions on Simulated Datasets
Reconstructions on an Experimental Dataset
...and 9 more sections

Figures (16)

Figure 1: Architectures of the two tasks in HetACUMN. (a) the variational image reconstruction task; (b) the conditional pose prediction (CPP) task.
Figure 2: Example of noisy EM image generation. (a) a noise-free reconstructed image. (b) a synthetic noisy EM image with SNR = -10 dB. The area outside the red circle is considered as the noise-only region.
Figure 3: Distributions of predicted conformations along the PC1 axis for the 20k (top) and 100k (bottom) 80S-bimodal datasets.
Figure 4: Visualization of the ground-truth volumes and predicted volumes from cryoDRGN2, cryoFIRE, and HetACUMN, reconstructed from the 20k (top) and 100k (bottom) 80S-bimodal datasets. The two conformational states are superposed with one state shown in color and the other in grey.
Figure 5: Distribution of the latent-$z$ for benchmark tests on the 1D-motion dataset (100k). (a) probability density distribution of the latent-$z$ along the PC1 axis after dimensionality reduction; (b) violin plots of the latent-$z$ statistics for each class.
...and 11 more figures
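
The caption of Figure 2 mentions a synthetic noisy image at SNR = -10 dB and a noise-only region outside a circle. As a rough illustration of what such a setup can look like, the sketch below adds white Gaussian noise to a toy image at a target SNR and then estimates the noise level from the pixels outside a centered circle. Everything here is an assumption made for illustration: the paper's actual noise model is not described in this excerpt, SNR is taken to be the ratio of signal variance to noise variance, and the function names (`add_noise_at_snr`, `noise_only_mask`, `estimate_noise_std`), the toy image, and the radii are hypothetical.

```python
import math
import random


def variance(values):
    """Population variance of a flat list of numbers."""
    mean = sum(values) / len(values)
    return sum((v - mean) ** 2 for v in values) / len(values)


def add_noise_at_snr(image, snr_db, rng):
    """Add zero-mean white Gaussian noise at a target SNR (in dB).

    Assumption: SNR = var(signal) / var(noise), so
    var(noise) = var(signal) / 10 ** (snr_db / 10).
    """
    pixels = [p for row in image for p in row]
    signal_power = variance(pixels)
    noise_power = signal_power / (10 ** (snr_db / 10))
    sigma = math.sqrt(noise_power)
    noisy = [[p + rng.gauss(0.0, sigma) for p in row] for row in image]
    return noisy, sigma


def noise_only_mask(size, radius):
    """True for pixels that lie outside a circle centered on the image."""
    center = (size - 1) / 2
    return [[(x - center) ** 2 + (y - center) ** 2 > radius ** 2
             for x in range(size)] for y in range(size)]


def estimate_noise_std(noisy, mask):
    """Standard deviation of the pixels selected by the mask."""
    samples = [p for row, mrow in zip(noisy, mask)
               for p, m in zip(row, mrow) if m]
    return math.sqrt(variance(samples))


# Toy "noise-free" image: a bright disk of radius 6 on a dark background.
size = 32
clean = [[1.0 if (x - 16) ** 2 + (y - 16) ** 2 < 36 else 0.0
          for x in range(size)] for y in range(size)]

rng = random.Random(0)  # fixed seed so the run is repeatable
noisy, sigma = add_noise_at_snr(clean, snr_db=-10.0, rng=rng)

# The disk lies well inside radius 14, so the masked region holds noise only.
mask = noise_only_mask(size, radius=14)
estimated_sigma = estimate_noise_std(noisy, mask)
print(f"true sigma = {sigma:.3f}, estimated from mask = {estimated_sigma:.3f}")
```

Under this assumed definition, -10 dB means the noise variance is ten times the signal variance. Because the masked region contains no signal in this toy image, its sample standard deviation gives a direct estimate of the noise level without access to the clean image.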
