Knowledge Distillation of Noisy Force Labels for Improved Coarse-Grained Force Fields
Feranmi V. Olowookere, Sakib Matin, Aleksandra Pachalieva, Nicholas Lubbers, Emily Shinkle
TL;DR
This work addresses the noise and instability that arise when training coarse-grained (CG) force fields, which stem from mapping all-atom (AA) forces onto CG representations and from the entropic contributions inherent to CG models, by introducing a knowledge distillation (KD) framework. An ensemble of eight CG teacher models is trained on CG-mapped forces to denoise the targets, and the teachers' outputs (forces and energies) are distilled into a single CG student that preserves ensemble-level accuracy at lower inference cost. The approach is validated on a deep eutectic solvent: distilling from an ensemble and supervising with per-bead energies markedly improves two-, three-, and many-body structural metrics (RDF, ADF, CDF) while giving roughly fivefold faster inference than the teacher ensemble. The results suggest that this KD workflow can yield accurate, transferable CG force fields suitable for large-scale simulations and that it could be extended to more complex materials such as polymers.
Abstract
Molecular dynamics simulations are an integral tool for studying the atomistic behavior of materials under diverse conditions. However, they can be computationally demanding, especially for large systems, which limits the accessible time and length scales. Coarse-grained (CG) models reduce computational expense by grouping atoms into simplified representations commonly termed beads, but they sacrifice atomic detail and introduce mapping noise, which complicates the training of machine-learned surrogates. Moreover, because CG models inherently include entropic contributions, they cannot be fit directly to all-atom energies, leaving instantaneous, noisy forces as the only state-specific quantities available for training. Here, we apply a knowledge distillation framework: we first train an initial CG neural network potential (the teacher) solely on CG-mapped forces to denoise those labels, then distill its force and energy predictions to train refined CG models (the student) in both single- and ensemble-training setups, exploring different combinations of force and energy targets. We validate this framework on a complex molecular fluid, a deep eutectic solvent, by evaluating two-, three-, and many-body properties and comparing the CG and all-atom results. Our findings demonstrate that training a student model on ensemble teacher-predicted forces and per-bead energies improves the quality and stability of CG force fields.
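
The distillation step described above can be illustrated with a minimal sketch. It is not the authors' implementation: the plain ensemble mean, the mean-squared-error form of the loss, and the names `ensemble_targets`, `student_loss`, `w_force`, and `w_energy` are all assumptions made for illustration. The sketch only shows how averaged teacher forces and per-bead energies could serve as student targets.

```python
import numpy as np


def ensemble_targets(teacher_forces, teacher_bead_energies):
    """Average teacher predictions over the ensemble axis (axis 0).

    teacher_forces:        shape (n_teachers, n_beads, 3)
    teacher_bead_energies: shape (n_teachers, n_beads)
    Assumption: a plain mean is used to combine the teachers.
    """
    return teacher_forces.mean(axis=0), teacher_bead_energies.mean(axis=0)


def student_loss(pred_forces, pred_bead_energies,
                 target_forces, target_bead_energies,
                 w_force=1.0, w_energy=1.0):
    """Weighted sum of force and per-bead energy mean-squared errors.

    The weights and the MSE form are hypothetical choices, not taken
    from the paper.
    """
    force_term = np.mean((pred_forces - target_forces) ** 2)
    energy_term = np.mean((pred_bead_energies - target_bead_energies) ** 2)
    return w_force * force_term + w_energy * energy_term


# Toy data: 8 teachers, 5 beads. Each teacher deviates from a reference
# by a fixed offset; the offsets are symmetric about zero, so they
# cancel in the ensemble mean.
offsets = np.linspace(-1.0, 1.0, 8).reshape(8, 1, 1)
ref_forces = np.arange(15, dtype=float).reshape(5, 3)
ref_energies = np.ones(5)

teacher_forces = ref_forces[None, :, :] + offsets           # (8, 5, 3)
teacher_energies = ref_energies[None, :] + offsets[:, :, 0]  # (8, 5)

f_target, e_target = ensemble_targets(teacher_forces, teacher_energies)

# A student whose forces are off by 1.0 everywhere and whose per-bead
# energies match the targets exactly.
loss = student_loss(f_target + 1.0, e_target, f_target, e_target)
print(loss)
```

In this toy setup the teacher offsets cancel on averaging, so the ensemble targets coincide with the reference values. Real teacher errors are not guaranteed to cancel this cleanly.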
