CryoHype: Reconstructing a thousand cryo-EM structures with transformer-based hypernetworks

Jeffrey Gu, Minkyu Jeon, Ambri Ma, Serena Yeung-Levy, Ellen D. Zhong

TL;DR

CryoHype introduces a transformer-based hypernetwork that dynamically generates the weights of a per-structure implicit neural representation (INR) to address extreme compositional heterogeneity in cryo-EM. It achieves state-of-the-art FSC_AUC on Tomotwin-100 and scales to 1,000 structures on Sim2Struct-1000, outperforming cryoDRGN, especially as heterogeneity increases. Because FSC alone does not fully capture reconstruction quality under heterogeneity, the authors propose real-space metrics (vIoU and CD) and a large dataset, Sim2Struct-1000, and use them to demonstrate more accurate shape recovery. The work highlights the expressivity advantages of hypernetworks for conditional INRs in large-scale heterogeneous cryo-EM reconstruction and discusses future directions, including ab initio pose estimation and joint handling of conformational and compositional heterogeneity.

Abstract

Cryo-electron microscopy (cryo-EM) is an indispensable technique for determining the 3D structures of dynamic biomolecular complexes. While typically applied to image a single molecular species, cryo-EM has the potential for structure determination of many targets simultaneously in a high-throughput fashion. However, existing methods typically focus on modeling conformational heterogeneity within a single or a few structures and are not designed to resolve compositional heterogeneity arising from mixtures of many distinct molecular species. To address this challenge, we propose CryoHype, a transformer-based hypernetwork for cryo-EM reconstruction that dynamically adjusts the weights of an implicit neural representation. Using CryoHype, we achieve state-of-the-art results on a challenging benchmark dataset containing 100 structures. We further demonstrate that CryoHype scales to the reconstruction of 1,000 distinct structures from unlabeled cryo-EM images in the fixed-pose setting.

Paper Structure

This paper contains 40 sections, 3 equations, 16 figures, 9 tables.

Figures (16)

  • Figure 1: CryoHype architecture. An input image $X_i$ of an unknown structure is first tokenized and concatenated with learnable weight tokens. All tokens are then processed with a transformer encoder, and the output weight tokens are used to modify the weights of an implicit neural representation (INR) that reconstructs the structure $V_i$.
  • Figure 2: Sim2Struct-1000. Example atomic models, density maps, and projected images from Sim2Struct-1000, containing 1000 distinct structures.
  • Figure 3: Qualitative results on Tomotwin-100 and Sim2Struct-1000. Representative density volumes and the corresponding ground truth volume. Additional examples are given in Figure \ref{fig:si_tt100} and Figure \ref{fig:si_sim2struct1000} in the Appendix.
  • Figure 4: Per-Image FSC. Each curve shows the average FSC curve across all conformations, with error bars indicating the standard deviation. The full FSC curves are shown in the Appendix.
  • Figure 5: Latent Visualization for Tomotwin-100 and Sim2Struct-1000. (a) Latent embeddings from cryoDRGN, visualized by UMAP and colored by the 10, 100, 200, 500, and 1000 ground-truth (G.T.) proteins. (b) Latent embeddings for CryoHype.
  • ...and 11 more figures
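
The data flow in the Figure 1 caption (tokenize the image, concatenate learnable weight tokens, run a transformer encoder, use the output weight tokens to modify the INR weights) can be illustrated with a toy NumPy sketch. This is not the authors' implementation. The class name `TinyHyperINR`, all sizes, the single self-attention layer standing in for the transformer encoder, and the multiplicative modulation of one INR layer are assumptions made for illustration only; the caption does not specify how the weights are modified.

```python
import numpy as np

rng = np.random.default_rng(0)


def patchify(image, patch):
    """Split a square (D, D) image into flattened, non-overlapping patches."""
    n = image.shape[0] // patch
    blocks = image.reshape(n, patch, n, patch).swapaxes(1, 2)
    return blocks.reshape(n * n, patch * patch)


def self_attention(tokens, wq, wk, wv):
    """One single-head self-attention layer with a residual connection.

    ASSUMPTION: a stand-in for the full transformer encoder.
    """
    q, k, v = tokens @ wq, tokens @ wk, tokens @ wv
    scores = q @ k.T / np.sqrt(q.shape[-1])
    scores -= scores.max(axis=-1, keepdims=True)  # numerical stability
    attn = np.exp(scores)
    attn /= attn.sum(axis=-1, keepdims=True)
    return tokens + attn @ v


class TinyHyperINR:
    """Hypothetical toy model: image -> per-image INR weights -> density."""

    def __init__(self, patch=4, dim=32, hidden=8):
        self.patch, self.hidden = patch, hidden
        s = 0.1
        self.embed = rng.normal(0, s, (patch * patch, dim))   # patch embedding
        self.weight_tokens = rng.normal(0, s, (hidden, dim))  # learnable tokens
        self.wq, self.wk, self.wv = (rng.normal(0, s, (dim, dim)) for _ in range(3))
        self.to_mod = rng.normal(0, s, (dim, 3))  # token -> one row of modulation
        self.w1 = rng.normal(0, 1, (hidden, 3))   # base INR layer (3D coords in)
        self.w2 = rng.normal(0, 1, (hidden,))     # INR output layer

    def generate_weights(self, image):
        image_tokens = patchify(image, self.patch) @ self.embed
        tokens = np.concatenate([image_tokens, self.weight_tokens], axis=0)
        out = self_attention(tokens, self.wq, self.wk, self.wv)
        out_weight_tokens = out[-self.hidden:]  # keep only the weight tokens
        # ASSUMPTION: multiplicative modulation of the base weights.
        return self.w1 * (1.0 + out_weight_tokens @ self.to_mod)

    def density(self, image, coords):
        """Evaluate the image-conditioned INR at 3D coordinates of shape (N, 3)."""
        w1 = self.generate_weights(image)
        hidden = np.maximum(coords @ w1.T, 0.0)  # ReLU
        return hidden @ self.w2


model = TinyHyperINR()
image_a = rng.normal(size=(16, 16))
image_b = rng.normal(size=(16, 16))
coords = rng.uniform(-0.5, 0.5, size=(5, 3))

weights_a = model.generate_weights(image_a)
weights_b = model.generate_weights(image_b)
density_a = model.density(image_a, coords)
```

Each input image yields its own INR weights, so two images of different structures are decoded by two different networks rather than by one network conditioned on a low-dimensional latent code. The parameters here are random and untrained; a real model would be trained by comparing projections of the predicted volume against the input images.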