Fusing Neural and Physical: Augment Protein Conformation Sampling with Tractable Simulations
Jiarui Lu, Zuobai Zhang, Bozitao Zhong, Chence Shi, Jian Tang
TL;DR
The paper tackles the high cost of sampling protein conformations by combining a zero-shot diffusion sampler (Str2Str) with short, tractable MD simulations run in parallel. By drawing seeds from the pre-trained sampler, running MD in parallel from each seed, and then fine-tuning the sampler on the target protein (Str2Str-NE and Str2Str-FT), the authors improve ensemble quality within a practical computational budget. The approach achieves state-of-the-art performance across multiple metrics on fast-folding proteins, highlighting the value of integrating physics-based refinement with neural samplers. The work provides a scalable framework for energy-aware conformational sampling that pairs rapid neural proposals with local MD equilibration to produce more Boltzmann-like ensembles at affordable compute.
Abstract
Protein dynamics are common and important for the biological functions and properties of proteins, and studying them usually involves time-consuming in silico molecular dynamics (MD) simulations. Recently, generative models have been leveraged as surrogate samplers to obtain conformation ensembles orders of magnitude faster and without requiring any simulation data ("zero-shot" inference). However, because such generative models are agnostic of the underlying energy landscape, their accuracy may still be limited. In this work, we explore the few-shot setting of such a pre-trained generative sampler, which incorporates MD simulations in a tractable manner. Specifically, given a target protein of interest, we first acquire seeding conformations from the pre-trained sampler and then run a number of physical simulations in parallel, starting from these seeds. We then fine-tune the generative model on the resulting simulation trajectories to obtain a target-specific sampler. Experimental results demonstrate the superior performance of this few-shot conformation sampler at a tractable computational cost.
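The three steps described above (seed, simulate in parallel, fine-tune) can be illustrated with a toy sketch. This is not the authors' implementation, and every name in it is hypothetical. It assumes a one-dimensional double-well energy in place of a protein force field, a broad Gaussian in place of the pre-trained zero-shot sampler, vectorized overdamped Langevin dynamics in place of parallel MD, and an EM fit of a two-component Gaussian mixture in place of fine-tuning a diffusion model.

```python
import numpy as np

def energy(x):
    # Toy double-well energy U(x) = (x^2 - 1)^2 (assumption; stands in for a force field).
    return (x**2 - 1.0) ** 2

def energy_grad(x):
    # Analytic gradient dU/dx of the toy energy.
    return 4.0 * x * (x**2 - 1.0)

def pretrained_sampler(n, rng):
    # Hypothetical stand-in for a zero-shot sampler: a broad, energy-agnostic Gaussian.
    # Samples are clipped to keep the explicit integrator below stable.
    return np.clip(rng.normal(0.0, 1.5, size=n), -4.0, 4.0)

def short_md(seeds, n_steps, dt, kT, rng):
    # Overdamped Langevin dynamics; all seeds advance together in one vectorized
    # update, which plays the role of independent simulations run in parallel.
    x = seeds.copy()
    traj = np.empty((n_steps, x.size))
    for t in range(n_steps):
        noise = rng.normal(size=x.shape)
        x = x - dt * energy_grad(x) + np.sqrt(2.0 * kT * dt) * noise
        traj[t] = x
    return traj

def fine_tune(frames, n_iter=50):
    # Stand-in for fine-tuning: fit a two-component 1D Gaussian mixture with EM.
    mu = np.array([frames.min(), frames.max()])
    sigma = np.full(2, frames.std())
    w = np.full(2, 0.5)
    for _ in range(n_iter):
        # E-step: responsibilities, computed in log space for numerical stability.
        z = (frames[:, None] - mu) / sigma
        logp = np.log(w) - np.log(sigma) - 0.5 * z**2
        logp -= logp.max(axis=1, keepdims=True)
        resp = np.exp(logp)
        resp /= resp.sum(axis=1, keepdims=True)
        # M-step: update mixture weights, means and standard deviations.
        nk = resp.sum(axis=0) + 1e-12
        w = nk / nk.sum()
        mu = (resp * frames[:, None]).sum(axis=0) / nk
        var = (resp * (frames[:, None] - mu) ** 2).sum(axis=0) / nk
        sigma = np.sqrt(var) + 1e-6
    return w, mu, sigma

def sample_finetuned(params, n, rng):
    # Draw from the fitted mixture: pick a component, then sample its Gaussian.
    w, mu, sigma = params
    k = rng.choice(w.size, size=n, p=w)
    return rng.normal(mu[k], sigma[k])

rng = np.random.default_rng(0)
seeds = pretrained_sampler(64, rng)                       # step 1: seeding conformations
traj = short_md(seeds, n_steps=2000, dt=1e-3, kT=0.3, rng=rng)  # step 2: short simulations
frames = traj[1000:].ravel()                              # drop the first half as burn-in
params = fine_tune(frames)                                # step 3: target-specific sampler

zero_shot = pretrained_sampler(5000, rng)
few_shot = sample_finetuned(params, 5000, rng)
print("mean energy, zero-shot:", energy(zero_shot).mean())
print("mean energy, few-shot :", energy(few_shot).mean())
```

In this toy setting the energy-agnostic proposals spread far outside the two wells, while the sampler refit on the short trajectories concentrates near the minima at x = -1 and x = 1. The seeds only need to land in the right basins; the short simulations supply the local energy information that the zero-shot model lacks.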
