HEroBM: a deep equivariant graph neural network for universal backmapping from coarse-grained to all-atom representations
Daniele Angioletti, Stefano Raniolo, Vittorio Limongelli
TL;DR
Coarse-grained (CG) simulations enable large-scale, long-timescale molecular modeling but discard atomistic detail, which makes recovering full atomistic structures difficult. HEroBM introduces a universal, locality-driven backmapping framework based on SE(3)-equivariant graph neural networks and a hierarchical anchor scheme to reconstruct atomistic coordinates from any CG mapping. Across proteins, lipids, and small molecules, HEroBM achieves high reconstruction fidelity with substantially less training data than competing ML-based methods, and it scales to large systems and real CG trajectories, including GPCRs in lipid bilayers with bound ligands. The framework supports end-to-end backmapping and integrates with energy minimization and MD workflows, offering a practical, adaptable tool (potentially via a webserver) for accurate atomistic restoration from coarse-grained simulations.
Abstract
Molecular simulations have assumed a paramount role in chemistry, biology, and materials science because they can capture the intricate dynamic properties of systems. Within this realm, coarse-grained (CG) techniques have emerged as invaluable tools to sample large-scale systems and reach extended timescales by simplifying the system representation. However, CG approaches come with a trade-off: they sacrifice atomistic details that might be highly relevant to understanding the investigated process. Therefore, a recommended approach is to identify key CG conformations and process them with backmapping methods, which retrieve atomistic coordinates. Currently, rule-based methods yield subpar geometries and rely on energy relaxation to correct them, resulting in less-than-optimal outcomes. Conversely, machine learning techniques offer higher accuracy but are either limited in transferability between systems or tied to specific CG mappings. In this work, we introduce HEroBM, a dynamic and scalable method that employs deep equivariant graph neural networks and a hierarchical approach to achieve high-resolution backmapping. HEroBM handles any type of CG mapping, offering a versatile and efficient protocol for reconstructing atomistic structures with high accuracy. Because it is built on local principles, HEroBM spans the entire chemical space and is transferable to systems of varying sizes. We illustrate the versatility of our framework through diverse biological systems, including a complex real-case scenario. Here, our end-to-end backmapping approach accurately generates the atomistic coordinates of a G protein-coupled receptor bound to an organic small molecule within a cholesterol/phospholipid bilayer.
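To make the idea of a hierarchical, equivariant reconstruction concrete, the following is a minimal sketch and not the authors' implementation. It assumes that each atom is placed by adding an offset vector to an anchor, where the anchor is either a CG bead or an atom placed earlier in the hierarchy. The bead name, atom names, offsets, and the helper functions `reconstruct` and `rotate_z90` are all hypothetical; in a learned model the offsets would be predicted by the network rather than hard-coded. The final check illustrates equivariance: rotating the beads and the offsets together rotates the reconstructed atoms in the same way.

```python
from typing import Dict, List, Tuple

Vec = Tuple[float, float, float]


def add(a: Vec, b: Vec) -> Vec:
    """Component-wise sum of two 3D vectors."""
    return (a[0] + b[0], a[1] + b[1], a[2] + b[2])


def reconstruct(
    beads: Dict[str, Vec],
    hierarchy: List[Tuple[str, str, Vec]],
) -> Dict[str, Vec]:
    """Place atoms level by level.

    Each hierarchy entry is (atom, anchor, offset). The anchor must be a
    bead or an atom that appears earlier in the list (assumed ordering).
    """
    placed = dict(beads)  # everything that can currently serve as an anchor
    atoms: Dict[str, Vec] = {}
    for atom, anchor, offset in hierarchy:
        if anchor not in placed:
            raise KeyError(f"anchor {anchor!r} not placed before {atom!r}")
        position = add(placed[anchor], offset)
        placed[atom] = position
        atoms[atom] = position
    return atoms


def rotate_z90(v: Vec) -> Vec:
    """Rotate a vector by 90 degrees about the z axis."""
    return (-v[1], v[0], v[2])


# Hypothetical toy system: one backbone bead and two atoms.
beads = {"BB": (1.0, 0.0, 0.0)}
hierarchy = [
    ("CA", "BB", (0.0, 0.5, 0.0)),  # first level: anchored on the bead
    ("CB", "CA", (1.0, 0.0, 0.5)),  # second level: anchored on CA
]
atoms = reconstruct(beads, hierarchy)

# Equivariance check: rotate the inputs, then reconstruct.
rot_beads = {name: rotate_z90(p) for name, p in beads.items()}
rot_hierarchy = [(a, anc, rotate_z90(off)) for a, anc, off in hierarchy]
rot_atoms = reconstruct(rot_beads, rot_hierarchy)
equivariant = all(rot_atoms[a] == rotate_z90(atoms[a]) for a in atoms)
```

Anchoring each atom on a nearby, already-placed site keeps every offset short and local, which is one plausible reading of why such a scheme could transfer across systems of different sizes.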
