ATOM: A Pretrained Neural Operator for Multitask Molecular Dynamics
Luke Thompson, Davy Guan, Dai Shi, Slade Matthews, Junbin Gao, Andi Han
TL;DR
ATOM introduces a pretrained neural operator for molecular dynamics that relaxes strict equivariance and autoregressive constraints by using quasi-equivariant lifting and heterogeneous temporal attention, enabling parallel decoding of trajectories across multiple timesteps and molecules. Trained on TG80, a diverse multitask MD dataset, ATOM achieves state-of-the-art single-task performance on MD17, RMD17, and MD22 and shows strong zero-shot generalization to unseen compounds and horizons after multitask pretraining. The combination of time-aware encoding (T-RoPE), label-noise regularization, and point-cloud processing yields robust transfer across chemical space and timescales, highlighting the potential of neural operators for efficient, transferable MD modeling. TG80 provides a scalable, open benchmark for future work, enabling researchers to assess zero-shot transfer and long-horizon accuracy in realistic molecular dynamics scenarios.
Abstract
Molecular dynamics (MD) simulations underpin modern computational drug discovery, materials science, and biochemistry. Recent machine learning models provide high-fidelity MD predictions without the need to repeatedly solve for quantum mechanical forces, enabling significant speedups over conventional pipelines. Yet many such methods enforce strict equivariance and rely on sequential rollouts, which limits their flexibility and simulation efficiency. They are also commonly single-task, trained on individual molecules and fixed timeframes, which restricts generalization to unseen compounds and extended timesteps. To address these issues, we propose Atomistic Transformer Operator for Molecules (ATOM), a pretrained transformer neural operator for multitask molecular dynamics. ATOM adopts a quasi-equivariant design that requires no explicit molecular graph and employs a temporal attention mechanism, allowing for the accurate parallel decoding of multiple future states. To support operator pretraining across chemicals and timescales, we curate TG80, a large, diverse, and numerically stable MD dataset with over 2.5 million femtoseconds of trajectories across 80 compounds. ATOM achieves state-of-the-art performance on established single-task benchmarks, such as MD17, RMD17, and MD22. After multitask pretraining on TG80, ATOM shows exceptional zero-shot generalization to unseen molecules across varying time horizons. We believe ATOM represents a significant step toward accurate, efficient, and transferable molecular dynamics models.
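
The contrast between sequential rollouts and parallel decoding of multiple future states can be illustrated with a toy sketch. This is not ATOM or any part of its architecture: it assumes a one-dimensional, unit-frequency harmonic oscillator whose exact flow map stands in for a learned model, and the names `step`, `sequential_rollout`, and `parallel_decode` are hypothetical. It shows only the structural difference: a rollout feeds each prediction into the next step, whereas an operator-style model maps the initial state and a set of time horizons directly to the corresponding future states.

```python
import math

def step(state, dt):
    """Exact flow of a unit-frequency harmonic oscillator over time dt.

    Toy stand-in (assumption) for a learned model; state is (position, velocity).
    """
    x, v = state
    c, s = math.cos(dt), math.sin(dt)
    return (x * c + v * s, -x * s + v * c)

def sequential_rollout(state, dt, n_steps):
    """Autoregressive decoding: step k depends on the output of step k - 1."""
    trajectory = []
    for _ in range(n_steps):
        state = step(state, dt)
        trajectory.append(state)
    return trajectory

def parallel_decode(state, horizons):
    """Operator-style decoding: each horizon is computed from the initial
    state alone, so the evaluations are independent and could be batched."""
    return [step(state, t) for t in horizons]

initial = (1.0, 0.0)
dt, n_steps = 0.1, 5

rollout = sequential_rollout(initial, dt, n_steps)
direct = parallel_decode(initial, [dt * (k + 1) for k in range(n_steps)])

# Largest componentwise difference between the two decoding strategies.
max_gap = max(abs(a - b) for r, d in zip(rollout, direct) for a, b in zip(r, d))
```

Because the toy flow map is exact, both strategies agree up to floating-point error. With a learned one-step model, errors in a rollout compound from step to step, while direct decoding avoids that sequential dependency.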
