Improving AlphaFlow for Efficient Protein Ensembles Generation
Shaoning Li, Mingyu Li, Yusong Wang, Xinheng He, Nanning Zheng, Jian Zhang, Pheng-Ann Heng
TL;DR
This work tackles the high computational cost of generating protein conformational ensembles with flow-based approaches. It introduces AlphaFlow-Lit, a feature-conditioned, light-weight variant that freezes the Evoformer and relies on precomputed single/pair features to accelerate sampling, achieving an approximately $47\times$ speedup while maintaining performance comparable to AlphaFlow. The authors validate the method on ATLAS MD trajectories, showing that AlphaFlow-Lit preserves essential and global dynamics similar to MD and outperforms the distilled AlphaFlow variant in diversity and correlation metrics, while offering superior runtime scalability. The approach makes dense protein ensemble generation considerably more practical, enabling faster exploration of conformational landscapes and large-scale analyses of dynamics and long-range couplings with deep learning tools.
Abstract
Investigating the conformational landscapes of proteins is a crucial way to understand their biological functions and properties. AlphaFlow stands out as a sequence-conditioned generative model that introduces flexibility into structure prediction models by fine-tuning AlphaFold under the flow-matching framework. Despite the efficient sampling afforded by flow-matching, AlphaFlow still requires multiple runs of AlphaFold to generate a single conformation. Because each AlphaFold run is computationally expensive, AlphaFlow's applicability is limited when sampling larger protein ensembles or longer chains within a constrained timeframe. In this work, we propose a feature-conditioned generative model called AlphaFlow-Lit to realize efficient protein ensemble generation. In contrast to fully fine-tuning the entire model, we focus solely on the light-weight structure module to reconstruct the conformation. AlphaFlow-Lit performs on par with AlphaFlow and surpasses its distilled version without pretraining, all while achieving a significant sampling acceleration of around 47 times. The advancement in efficiency showcases the potential of AlphaFlow-Lit in enabling faster and more scalable generation of protein ensembles.
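To make the source of the acceleration concrete, the following is a minimal sketch that counts how often the expensive and cheap parts of the network would run under each sampling scheme. It is an illustration under stated assumptions, not the authors' implementation: the names `CallCounter`, `sample_full_finetune`, `sample_feature_conditioned`, and `relative_cost` are hypothetical, the sample and step counts are arbitrary, and it assumes the frozen Evoformer features are computed once per sequence and reused for every sample.

```python
from dataclasses import dataclass


@dataclass
class CallCounter:
    """Counts forward passes through each part of a hypothetical model."""
    trunk_calls: int = 0  # expensive Evoformer-like passes
    head_calls: int = 0   # cheap structure-module-like passes


def sample_full_finetune(n_samples: int, n_steps: int) -> CallCounter:
    """AlphaFlow-style sampling: every flow-matching step reruns the full model."""
    counter = CallCounter()
    for _ in range(n_samples):
        for _ in range(n_steps):
            counter.trunk_calls += 1
            counter.head_calls += 1
    return counter


def sample_feature_conditioned(n_samples: int, n_steps: int) -> CallCounter:
    """Feature-conditioned sampling: trunk features are precomputed once
    (assumption: once per sequence), and only the head runs per step."""
    counter = CallCounter()
    counter.trunk_calls += 1  # precompute single/pair features
    for _ in range(n_samples):
        for _ in range(n_steps):
            counter.head_calls += 1
    return counter


def relative_cost(counter: CallCounter, trunk_cost: float, head_cost: float) -> float:
    """Total cost in arbitrary units, given assumed per-call costs."""
    return counter.trunk_calls * trunk_cost + counter.head_calls * head_cost


if __name__ == "__main__":
    # Illustrative numbers only; they are not taken from the paper.
    full = sample_full_finetune(n_samples=250, n_steps=10)
    lit = sample_feature_conditioned(n_samples=250, n_steps=10)
    print("full fine-tune:", full)
    print("feature-conditioned:", lit)
```

In this toy accounting the number of structure-module passes is identical in both schemes, and the entire saving comes from removing repeated trunk passes. The real speedup depends on the actual per-call costs and on sequence length, which the sketch does not model.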
