Learning Biomolecular Motion: The Physics-Informed Machine Learning Paradigm
Aaryesh Deshpande
TL;DR
This work surveys physics-informed machine learning (PIML) for biomolecular dynamics, arguing that integrating physical laws with data can overcome the sampling and accuracy limits of classical MD. It organizes methods into PINNs, neural operators, differentiable simulation, and hybrid closures, showing how each enforces thermodynamic consistency, detailed balance, and variational optimality to model long-timescale kinetics and rare events. Key contributions include a taxonomy of frameworks, a discussion of differentiable MD engines, and application-oriented insights for free-energy learning, folding, and binding, along with practical limitations and mitigation strategies. The review forecasts a near-term shift toward differentiable, uncertainty-aware, and mechanistically biased models, potentially enabling end-to-end learning from experiment to design while preserving physical interpretability and transferability across thermodynamic conditions.
Abstract
The convergence of statistical learning and molecular physics is transforming our approach to modeling biomolecular systems. Physics-informed machine learning (PIML) offers a systematic framework that integrates data-driven inference with physical constraints, resulting in models that are accurate, mechanistic, generalizable, and able to extrapolate beyond observed domains. This review surveys recent advances in physics-informed neural networks and operator learning, differentiable molecular simulation, and hybrid physics-ML potentials, with emphasis on long-timescale kinetics, rare events, and free-energy estimation. We frame these approaches as solutions to the "biomolecular closure problem", recovering unresolved interactions beyond classical force fields while preserving thermodynamic consistency and mechanistic interpretability. We examine theoretical foundations, tools and frameworks, computational trade-offs, and unresolved issues, including model expressiveness and stability. We outline prospective research avenues at the intersection of machine learning, statistical physics, and computational chemistry, contending that future advances will depend on mechanistic inductive biases and integrated differentiable physical learning frameworks for biomolecular simulation and discovery.
