Improved Cryo-EM Pose Estimation and 3D Classification through Latent-Space Disentanglement

Weijie Chen, Yuhang Wang, Lin Yao

TL;DR

This work tackles heterogeneous cryo-EM reconstruction under extreme noise and unknown poses by introducing HetACUMN, a self-supervised, amortized-inference variational autoencoder. It jointly learns variational image reconstruction and a conditional pose-prediction task to explicitly disentangle pose and conformation latent spaces, improving conformational classification and enabling efficient heterogeneous reconstructions. Demonstrations on simulated datasets show superior or competitive performance against state-of-the-art amortized methods and non-amortized baselines, with a notably more concentrated latent-z space that facilitates discriminating multiple conformations; experiments on real spliceosome data reveal plausible conformational dynamics consistent with prior literature. The approach offers a scalable pathway to accurate, fast heterogeneous cryo-EM analysis, though future work is needed to capture more complex conformational variability beyond the chosen latent-space representation.

Abstract

Due to the extremely low signal-to-noise ratio (SNR) and unknown poses (projection angles and image shifts) in cryo-electron microscopy (cryo-EM) experiments, reconstructing 3D volumes from 2D images is very challenging. In addition to these challenges, heterogeneous cryo-EM reconstruction requires conformational classification. In popular cryo-EM reconstruction algorithms, poses and conformation classification labels must be predicted for every input cryo-EM image, which can be computationally costly for large datasets. An emerging class of methods adopts amortized inference instead. In these methods, only a subset of the input dataset is needed to train neural networks that estimate poses and conformations. Once trained, these networks can make pose and conformation predictions and 3D reconstructions for the entire dataset at low cost during inference. Unfortunately, in heterogeneous reconstruction tasks, current amortized-inference-based methods struggle to estimate the conformational distribution and poses from entangled latent variables. Here, we propose "HetACUMN", a self-supervised variational autoencoder architecture based on amortized inference. We employ an auxiliary conditional pose prediction task, obtained by inverting the order of the encoder and decoder, to explicitly enforce the disentanglement of conformation and pose predictions. Results on simulated datasets show that HetACUMN generates more accurate conformational classifications than other amortized or non-amortized methods. Furthermore, we show that HetACUMN is capable of performing heterogeneous 3D reconstructions of a real experimental dataset.
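
As a point of reference, the variational image reconstruction task can be related to a generic VAE objective with separate conformation and pose variables. The symbols below ($x$ for an input image, $z$ for the conformation latent, $\phi$ for the predicted pose, $\beta$ for a KL weight) are illustrative assumptions, not the paper's notation; the paper's own loss, including the conditional pose prediction term, is defined in its equations and is not reproduced here.

$$
\mathcal{L}(x) = \mathbb{E}_{q(z \mid x)}\big[\log p(x \mid z, \phi)\big] - \beta \, \mathrm{KL}\big(q(z \mid x) \,\|\, p(z)\big)
$$

In this sketch, disentanglement would mean that $z$ carries conformational information only, while $\phi$ carries the projection angles and image shifts.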

Paper Structure (24 sections, 11 equations, 16 figures, 3 tables)

Figures (16)

  • Figure 1: Architectures of the two tasks in HetACUMN. (a) the variational image reconstruction task; (b) the conditional pose prediction (CPP) task.
  • Figure 2: Example of noisy EM image generation. (a) a noise-free reconstructed image; (b) a synthetic noisy EM image with SNR = -10 dB. The area outside the red circle is treated as the noise-only region.
  • Figure 3: Distributions of predicted conformations along the PC1 axis for the 20k (top) and 100k (bottom) 80S-bimodal datasets.
  • Figure 4: Visualization of the ground-truth volumes and predicted volumes from cryoDRGN2, cryoFIRE, and HetACUMN, reconstructed from the 20k (top) and 100k (bottom) 80S-bimodal datasets. The two conformational states are superposed with one state shown in color and the other in grey.
  • Figure 5: Distribution of the latent-$z$ for benchmark tests on the 1D-motion dataset (100k). (a) probability density distribution of the latent-$z$ along the PC1 axis after dimension reduction; (b) violin plots of the latent-$z$ statistics for each class.
  • ...and 11 more figures
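
The noisy-image setup described in the Figure 2 caption can be illustrated with a minimal sketch. It assumes white Gaussian noise, the common definition SNR(dB) = 10 log10(signal power / noise power), and the clean image's variance as the signal power; the function names, the toy image, and the circle radius are hypothetical and not taken from the paper. At SNR = -10 dB the noise power is ten times the signal power.

```python
import numpy as np

def add_noise_at_snr(image, snr_db, rng):
    """Add white Gaussian noise so that signal/noise power matches snr_db.

    Assumption: signal power is taken as the variance of the clean image.
    """
    signal_power = image.var()
    noise_power = signal_power / (10.0 ** (snr_db / 10.0))
    noise = rng.normal(0.0, np.sqrt(noise_power), size=image.shape)
    return image + noise

def outside_circle_mask(size, radius):
    """Boolean mask that is True for pixels outside a centered circle."""
    yy, xx = np.mgrid[:size, :size]
    center = (size - 1) / 2.0
    return (yy - center) ** 2 + (xx - center) ** 2 > radius ** 2

rng = np.random.default_rng(0)
size = 128

# Toy "noise-free projection": a centered Gaussian blob (hypothetical data).
yy, xx = np.mgrid[:size, :size]
center = (size - 1) / 2.0
clean = np.exp(-((yy - center) ** 2 + (xx - center) ** 2) / (2 * 12.0 ** 2))

noisy = add_noise_at_snr(clean, snr_db=-10.0, rng=rng)

# Estimate the noise power from the region outside the circle, where the
# clean signal is negligible, then recover the SNR in dB.
mask = outside_circle_mask(size, radius=60)
noise_var = noisy[mask].var()
measured_snr_db = 10.0 * np.log10(clean.var() / noise_var)
print(f"measured SNR: {measured_snr_db:.2f} dB")
```

The estimate from the noise-only region should land close to the -10 dB target, because the toy signal is essentially zero outside the circle.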