Mapping Still Matters: Coarse-Graining with Machine Learning Potentials
Franz Görlich, Julija Zavadlav
TL;DR
This work investigates how coarse-graining (CG) mappings influence the representations learned by equivariant machine learning (ML) potentials, using liquid hexane, capped amino acids, and a polyalanine chain. It compares classical CG potentials with ML potentials (MACE) and shows that mapping choices can induce artifacts such as bond permutations, unphysical enantiomer symmetry, and chiral inversion pathways, all of which limit transferability. The study demonstrates that although ML potentials can learn the potential of mean force for many mappings, preserving topology and avoiding overlapping length scales are crucial for reliable predictions. The findings provide practical guidelines for selecting CG mappings compatible with modern architectures and highlight the need for topology-aware encoding or priors to achieve transferable, physically meaningful CG models with ML methods.
Abstract
Coarse-grained (CG) modeling enables molecular simulations to reach time and length scales inaccessible to fully atomistic methods. For classical CG models, the choice of mapping, that is, how atoms are grouped into CG sites, is a major determinant of accuracy and transferability. At the same time, the emergence of machine learning potentials (MLPs) offers new opportunities to build CG models that can, in principle, learn the true potential of mean force for any mapping. In this work, we systematically investigate how the choice of mapping influences the representations learned by equivariant MLPs by studying liquid hexane, amino acids, and polyalanine. We find that when the length scales of bonded and nonbonded interactions overlap, unphysical bond permutations can occur. We also demonstrate that correctly encoding species and maintaining stereochemistry are crucial, as neglecting either introduces unphysical symmetries. Our findings provide practical guidance for selecting CG mappings compatible with modern architectures and guide the development of transferable CG models.
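
To make the notion of a mapping concrete, the following minimal sketch groups atoms into CG sites by taking the center of mass of each group. It is an illustration only, not the procedure used in the paper: the function name `cg_map`, the straight-line toy carbon backbone, the 1.5 Å spacing, and the two- and three-site groupings are all assumptions chosen for readability.

```python
import numpy as np

def cg_map(positions, masses, groups):
    """Map atomistic coordinates to CG sites (hypothetical helper).

    Each CG site is placed at the center of mass of the atoms
    listed in the corresponding entry of `groups`.
    """
    positions = np.asarray(positions, dtype=float)
    masses = np.asarray(masses, dtype=float)
    sites = []
    for idx in groups:
        m = masses[idx]
        # Mass-weighted average of the atom positions in this group
        sites.append((m[:, None] * positions[idx]).sum(axis=0) / m.sum())
    return np.array(sites)

# Toy backbone (assumption): six carbon atoms on a line, 1.5 Å apart
carbons = np.array([[1.5 * i, 0.0, 0.0] for i in range(6)])
masses = np.full(6, 12.011)

# Two alternative mappings of the same atomistic configuration
two_site = cg_map(carbons, masses, [[0, 1, 2], [3, 4, 5]])
three_site = cg_map(carbons, masses, [[0, 1], [2, 3], [4, 5]])
```

The same atomistic configuration yields CG sites at different separations depending on the grouping, which is one way the choice of mapping changes the length scales a CG potential has to represent.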
