Energy Loss Functions for Physical Systems
Sékou-Oumar Kaba, Kusha Sareen, Daniel Levy, Siamak Ravanbakhsh
TL;DR
The paper introduces energy loss functions, derived from a Boltzmann distribution, that embed physical priors directly into ML losses for physical systems in thermal equilibrium. Using a reverse KL divergence, the loss becomes the energy difference between the model prediction and each data point, which yields physically meaningful gradients and symmetry-respecting training while remaining architecture-agnostic. The framework is instantiated for atomistic systems (distance-based pair energies) and discrete spin systems, and is extended to diffusion-model training. Empirical results on molecule generation and spin ground-state prediction show improved performance and data efficiency over standard losses, with supporting analysis based on rigidity theory and the invariance properties of the loss.
Abstract
Effectively leveraging prior knowledge of a system's physics is crucial for applications of machine learning to scientific domains. Previous approaches have mostly focused on incorporating physical insights at the architectural level. In this paper, we propose a framework that incorporates physical information directly into the loss function for prediction and generative modeling tasks on systems like molecules and spins. We derive energy loss functions assuming that each data sample is in thermal equilibrium with respect to an approximate energy landscape. By using the reverse KL divergence with a Boltzmann distribution around the data, we obtain the loss as an energy difference between the data and the model predictions. This perspective also recasts traditional objectives like MSE as energy-based, but with a physically meaningless energy. In contrast, our formulation yields physically grounded loss functions with gradients that better align with valid configurations, while being architecture-agnostic and computationally efficient. The energy loss functions also inherently respect physical symmetries. We demonstrate our approach on molecular generation and spin ground-state prediction and report significant improvements over baselines.
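A minimal sketch may help illustrate the contrast between MSE, read as a quadratic energy in raw coordinates, and a distance-based pair energy. The harmonic pair energy used here is an assumption chosen for illustration and is not taken from the paper; the function names `mse_energy` and `pair_energy` and the toy coordinates are likewise hypothetical. Both energies vanish when the prediction equals the data, so each value can be read directly as an energy difference.

```python
import math
from itertools import combinations


def pair_distances(coords):
    """Distances between all unordered pairs of points (each a 3-tuple)."""
    return [math.dist(a, b) for a, b in combinations(coords, 2)]


def mse_energy(pred, target):
    """Quadratic 'energy' in raw coordinates: 0.5 * sum_i ||pred_i - target_i||^2."""
    return 0.5 * sum(math.dist(p, t) ** 2 for p, t in zip(pred, target))


def pair_energy(pred, target):
    """Assumed harmonic pair energy: 0.5 * sum_{i<j} (d_ij(pred) - d_ij(target))^2.

    It depends only on interatomic distances, so it is unchanged by
    rotating or translating the prediction as a rigid body.
    """
    d_pred = pair_distances(pred)
    d_true = pair_distances(target)
    return 0.5 * sum((dp - dt) ** 2 for dp, dt in zip(d_pred, d_true))


# Toy four-atom configuration (hypothetical data).
target = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (0.0, 1.0, 0.0), (0.0, 0.0, 1.0)]

# Same geometry, rotated 90 degrees about the z axis and shifted along x.
moved = [(-y + 1.0, x, z) for x, y, z in target]

# Genuinely different geometry: the second atom is pulled away from the first.
stretched = [target[0], (2.0, 0.0, 0.0), target[2], target[3]]

print("rigid motion: mse =", mse_energy(moved, target),
      " pair =", pair_energy(moved, target))
print("stretched:    mse =", mse_energy(stretched, target),
      " pair =", pair_energy(stretched, target))
```

Under these assumptions, the rigidly moved copy is penalized by the coordinate-space energy but not by the pair energy, while the stretched configuration is penalized by both. This is one simple way to see how a loss built on distances can respect rotational and translational symmetry without any change to the model architecture.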
