CryoFastAR: Fast Cryo-EM Ab Initio Reconstruction Made Easy

Jiakai Zhang; Shouchen Zhou; Haizhao Dai; Xinhang Liu; Peihao Wang; Zhiwen Fan; Yuan Pei; Jingyi Yu

TL;DR

CryoFastAR introduces a feed-forward geometric foundation model for fast ab initio cryo-EM reconstruction by directly predicting relative poses from unordered, noisy particle images. It employs a Vision Transformer–based encoder to extract multi-view features and a cross-attentive decoder to produce Fourier planar maps that encode poses relative to a reference view, enabling efficient Fourier-domain back-projection for reconstruction. The model is trained with a progressive curriculum on a large-scale simulated dataset and fine-tuned on real cryo-EM data, achieving competitive reconstruction quality while delivering substantial speedups over traditional iterative pipelines. This work demonstrates the viability of end-to-end pose prediction in cryo-EM and highlights the potential of geometric foundation models to accelerate and stabilize high-resolution structure determination under challenging imaging conditions.
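The Fourier-domain back-projection step mentioned above can be illustrated with a minimal NumPy sketch. This is not the authors' implementation. It assumes each predicted Fourier planar map stores one 3D frequency coordinate per image pixel, and it uses simple nearest-neighbor insertion with weight normalization. The names `planar_map_from_rotation` and `backproject` are hypothetical, and the rotation-based helper exists only to build a test input in place of a model prediction.

```python
import numpy as np

def planar_map_from_rotation(rotation, size):
    """Hypothetical helper: 3D frequency coordinates of a central slice.

    Stands in for a model-predicted Fourier planar map, shape (size, size, 3).
    """
    freqs = np.arange(size) - size // 2
    kx, ky = np.meshgrid(freqs, freqs, indexing="ij")
    plane = np.stack([kx, ky, np.zeros_like(kx)], axis=-1).astype(float)
    return plane @ rotation.T  # rotate every in-plane coordinate

def backproject(images_ft, planar_maps, size):
    """Insert centered 2D Fourier images into a 3D Fourier volume.

    Nearest-neighbor insertion; overlapping samples are averaged.
    """
    volume_ft = np.zeros((size, size, size), dtype=complex)
    weights = np.zeros((size, size, size))
    for img_ft, coords in zip(images_ft, planar_maps):
        idx = np.rint(coords).astype(int) + size // 2
        inside = np.all((idx >= 0) & (idx < size), axis=-1)
        i, j, k = idx[inside].T
        np.add.at(volume_ft, (i, j, k), img_ft[inside])
        np.add.at(weights, (i, j, k), 1.0)
    filled = weights > 0
    volume_ft[filled] /= weights[filled]
    return volume_ft

# Usage: one image at the identity pose fills the central z-plane.
size = 8
image_ft = np.full((size, size), 1.0 + 2.0j)
vol_ft = backproject([image_ft], [planar_map_from_rotation(np.eye(3), size)], size)
```

A real-space volume would then follow from an inverse 3D FFT of the filled Fourier volume. A practical pipeline would also need interpolation better than nearest-neighbor and CTF correction, both omitted here.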

Abstract

Pose estimation from unordered images is fundamental for 3D reconstruction, robotics, and scientific imaging. Recent geometric foundation models, such as DUSt3R, enable end-to-end dense 3D reconstruction but remain underexplored in scientific imaging fields like cryo-electron microscopy (cryo-EM), where they could support near-atomic protein reconstruction. In cryo-EM, pose estimation and 3D reconstruction from unordered particle images still depend on time-consuming iterative optimization, primarily because of low signal-to-noise ratios (SNR) and distortions from the contrast transfer function (CTF). We introduce CryoFastAR, the first geometric foundation model that can directly predict poses from noisy cryo-EM images for Fast ab initio Reconstruction. By integrating multi-view features and training on large-scale simulated cryo-EM data with realistic noise and CTF modulations, CryoFastAR improves pose estimation accuracy and generalization. To stabilize training, we propose a progressive training strategy that first lets the model extract essential features under simpler conditions and then gradually increases difficulty to improve robustness. Experiments on both synthetic and real datasets show that CryoFastAR achieves reconstruction quality comparable to traditional iterative approaches while significantly accelerating inference.
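The progressive training idea can be pictured as a difficulty schedule on the simulated noise level. The sketch below is only an illustration under assumptions: the abstract does not specify the schedule, so the log-linear interpolation, the SNR endpoints, and the names `snr_at_step` and `add_noise` are all hypothetical.

```python
import numpy as np

def snr_at_step(step, total_steps, snr_easy=10.0, snr_hard=0.01):
    """Hypothetical schedule: interpolate SNR in log space, easy to hard."""
    t = min(max(step / max(total_steps - 1, 1), 0.0), 1.0)
    return float(np.exp((1.0 - t) * np.log(snr_easy) + t * np.log(snr_hard)))

def add_noise(image, snr, rng):
    """Add Gaussian noise so that signal variance / noise variance == snr."""
    noise_std = np.sqrt(image.var() / snr)
    return image + rng.normal(0.0, noise_std, size=image.shape)

# Usage: clean projections become noisier as training progresses.
rng = np.random.default_rng(0)
clean = rng.random((64, 64))
schedule = [snr_at_step(s, 100) for s in range(100)]
noisy_late = add_noise(clean, schedule[-1], rng)
```

Early steps see nearly clean projections; late steps approach the low SNR typical of cryo-EM micrographs. CTF modulation, which the abstract also mentions, is left out of this sketch.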
