CryoMAE: Few-Shot Cryo-EM Particle Picking with Masked Autoencoders
Chentianye Xu, Xueying Zhan, Min Xu
TL;DR
CryoMAE addresses two obstacles in cryo-EM particle picking, scarce labels and low signal-to-noise ratio (SNR), with a two-stage few-shot method built on Masked Autoencoders (MAE) and a novel self-cross similarity loss. Stage 1 learns discriminative particle features from a small set of exemplars and unlabeled regions, using a PU-learning-inspired weighting to account for particles that may be present in the unlabeled data. Stage 2 applies the trained encoder to query micrographs and locates particles by the cosine similarity of their latent features to the exemplars, combined with a density-based threshold. On CryoPPP, the approach outperforms state-of-the-art neural network (NN)-based methods, improving 3D reconstruction resolution by up to $22.4\%$ (average $11.1\%$) while requiring only about 15 exemplars per protein type, which significantly reduces the labeling burden. Overall, CryoMAE makes cryo-EM analysis more practical by enabling accurate, data-efficient particle picking and more reliable downstream reconstructions.
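The Stage 2 matching step can be illustrated with a minimal sketch. It assumes the encoder has already produced one latent vector per exemplar and per query patch. The function names (`cosine_similarity_matrix`, `score_patches`, `pick_candidates`), the choice to average similarity over exemplars, and the fixed cutoff are illustrative assumptions rather than the authors' implementation. In particular, the fixed cutoff stands in for the density-based threshold, whose details are not given here.

```python
import numpy as np


def cosine_similarity_matrix(queries, exemplars, eps=1e-12):
    """Return the (n_queries, n_exemplars) matrix of cosine similarities."""
    # Normalize each row to unit length; eps guards against zero vectors.
    q = queries / np.maximum(np.linalg.norm(queries, axis=1, keepdims=True), eps)
    e = exemplars / np.maximum(np.linalg.norm(exemplars, axis=1, keepdims=True), eps)
    return q @ e.T


def score_patches(queries, exemplars):
    """Score each query patch by its mean cosine similarity to the exemplars.

    Averaging is an assumption made for this sketch; other aggregations
    (for example the maximum) would also be plausible.
    """
    return cosine_similarity_matrix(queries, exemplars).mean(axis=1)


def pick_candidates(scores, threshold):
    """Return indices of patches whose score reaches the cutoff.

    A fixed cutoff is a placeholder for the density-based threshold
    mentioned in the text.
    """
    return np.flatnonzero(scores >= threshold)


if __name__ == "__main__":
    # Toy 2-D "latent features": two identical exemplars, three query patches.
    exemplars = np.array([[1.0, 0.0], [1.0, 0.0]])
    queries = np.array([[2.0, 0.0], [0.0, 3.0], [-1.0, 0.0]])
    scores = score_patches(queries, exemplars)
    print(scores)
    print(pick_candidates(scores, threshold=0.5))
```

In this toy example, the first query points in the same direction as the exemplars and is the only patch selected. Vector magnitude does not affect the score because cosine similarity compares directions only.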
Abstract
Cryo-electron microscopy (cryo-EM) has emerged as a pivotal technology for determining the architecture of cells, viruses, and protein assemblies at near-atomic resolution. Particle picking, a key step in cryo-EM, traditionally relies either on laborious manual selection or on automated methods that are sensitive to low signal-to-noise ratio (SNR) and varied particle orientations. Moreover, existing neural network (NN)-based approaches often require extensive labeled datasets, limiting their practicality. To overcome these obstacles, we introduce cryoMAE, a novel few-shot learning approach that harnesses Masked Autoencoders (MAE) to enable efficient selection of single particles in cryo-EM images. Unlike conventional NN-based techniques, cryoMAE requires only a minimal set of positive particle images for training yet achieves high performance in particle detection. In addition, a self-cross similarity loss encourages distinct features for particle and background regions, thereby enhancing the discrimination capability of cryoMAE. Experiments on large-scale cryo-EM datasets show that cryoMAE outperforms existing state-of-the-art (SOTA) methods, improving 3D reconstruction resolution by up to 22.4%.
