Probing the effects of broken symmetries in machine learning

Marcel F. Langer; Sergey N. Pozdnyakov; Michele Ceriotti

TL;DR

We show that an unconstrained architecture can be trained to achieve a high degree of rotational invariance, and we test the impact of the small residual symmetry breaking in realistic simulations of gas-phase, liquid, and solid water.

Abstract

Symmetry is one of the most central concepts in physics, and it is no surprise that it has also been widely adopted as an inductive bias for machine-learning models applied to the physical sciences. This is especially true for models targeting the properties of matter at the atomic scale. Both established and state-of-the-art approaches, with almost no exceptions, are built to be exactly equivariant to translations, permutations, and rotations of the atoms. Incorporating symmetries -- rotations in particular -- constrains the model design space and implies more complicated architectures that are often also computationally demanding. There are indications that non-symmetric models can easily learn symmetries from data, and that doing so can even be beneficial for the accuracy of the model. We put a model that obeys rotational invariance only approximately to the test, in realistic scenarios involving simulations of gas-phase, liquid, and solid water. We focus specifically on physical observables that are likely to be affected -- directly or indirectly -- by symmetry breaking, finding negligible consequences when the model is used in an interpolative, bulk regime. For extrapolative gas-phase predictions, the model remains very stable, although symmetry artifacts are noticeable. We also discuss strategies that can be used to systematically reduce the magnitude of symmetry breaking when it occurs, and assess their impact on the convergence of observables.
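
One strategy of this kind is rotational averaging: the model is evaluated on several rotated copies of the structure and the predictions are averaged. The sketch below illustrates the idea on a toy problem and is not the authors' implementation. Everything in it is an assumption made for illustration: `toy_energy` is a hypothetical stand-in for a learned potential (an invariant pair term plus a deliberately orientation-dependent term), and the 24 proper rotations of a cube stand in for the orientation grids used in the paper, whose construction is not described here.

```python
import itertools

import numpy as np


def cube_rotations():
    """Return the 24 proper rotations of a cube (illustrative stand-in grid)."""
    rotations = []
    for perm in itertools.permutations(range(3)):
        for signs in itertools.product((1.0, -1.0), repeat=3):
            matrix = np.zeros((3, 3))
            for row in range(3):
                matrix[row, perm[row]] = signs[row]
            if np.linalg.det(matrix) > 0:  # keep rotations, drop reflections
                rotations.append(matrix)
    return rotations


def toy_energy(positions, eps=0.05):
    """Hypothetical non-invariant model: pair term plus an anisotropic term."""
    rel = positions - positions.mean(axis=0)
    i, j = np.triu_indices(len(positions), k=1)
    pair_term = np.sum(np.exp(-np.linalg.norm(positions[i] - positions[j], axis=1)))
    # This term depends on the orientation of the structure and breaks the symmetry.
    breaking_term = eps * np.sum(rel[:, 0] ** 2 + rel[:, 1] ** 4)
    return float(pair_term + breaking_term)


def averaged_energy(energy_fn, positions, rotations):
    """Average the prediction over rotated copies of the structure."""
    return float(np.mean([energy_fn(positions @ rot.T) for rot in rotations]))


def random_rotation(rng):
    """Draw a random proper rotation from the QR factorization of a Gaussian matrix."""
    q, r = np.linalg.qr(rng.normal(size=(3, 3)))
    q = q * np.sign(np.diag(r))
    if np.linalg.det(q) < 0:
        q[:, 0] = -q[:, 0]
    return q


def orientation_spread(energy_fn, positions, n_samples=200, seed=0):
    """Standard deviation of the prediction over random orientations."""
    rng = np.random.default_rng(seed)
    energies = [energy_fn(positions @ random_rotation(rng).T) for _ in range(n_samples)]
    return float(np.std(energies))


# Water-like geometry (approximate coordinates, for illustration only).
water = np.array([[0.0, 0.0, 0.0], [0.96, 0.0, 0.0], [-0.24, 0.93, 0.0]])
grid = cube_rotations()


def symmetrized(positions):
    return averaged_energy(toy_energy, positions, grid)


raw_spread = orientation_spread(toy_energy, water)
avg_spread = orientation_spread(symmetrized, water)
print(f"spread without averaging: {raw_spread:.2e}")
print(f"spread with averaging:    {avg_spread:.2e}")
```

In this toy setting the averaged prediction is exactly invariant under the rotations in the grid, while its variation under arbitrary rotations is reduced rather than eliminated. The cost grows linearly with the number of grid points, since each one requires a separate model evaluation.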

Paper Structure (2 sections, 1 equation, 3 figures)

Figures (3)

Figure 1: Simulations of a water molecule using a rotationally non-equivariant model. (a) Trajectories of the angular momentum components (dashed lines) and modulus (full line) during constant-energy molecular dynamics, for the model without symmetrization (red) and with rotational averaging over a $2\mathrm{i}$ grid. (b) Mean value of the torque acting on the molecule over a constant-temperature simulation, for different orientation grids (using the notation $N[\mathrm{i}]$). (c) Power spectra computed from the autocorrelation functions of the potential energy and of the non-equivariant part of the potential, $\Delta$ (computed as the difference between the raw model and a $2\mathrm{i}$ average). (d-e) Orientational free energy for the water molecule computed over a long constant-temperature simulation without (d) and with (e) $2\mathrm{i}$ rotational averaging.
Figure 2: Structural properties of liquid water at $T=300$ K, simulated with a model with and without $2\mathrm{i}$ rotational averaging. (a-b) Orientational free energy for the water molecule computed over a long constant-temperature simulation without (a) and with (b) $2\mathrm{i}$ averaging. (c) O-O pair correlation function. (d) Molecular orientation correlation function, computed separately for the longitudinal (full lines) and transverse (dashed lines) components.
Figure 3: Dynamical properties of liquid water at $T=300$ K, simulated with a model with and without $2\mathrm{i}$ rotational averaging. The shaded area around the curves indicates the (small) statistical uncertainty. (a) Oxygen mean-square displacement curves, whose slope is proportional to the diffusion coefficient. (b) Dipole autocorrelation function, which is indicative of the rotational dynamics of water molecules.
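
For reference, the proportionality between the slope of the mean-square displacement and the diffusion coefficient in Figure 3(a) is usually written as the Einstein relation. The form below assumes isotropic diffusion in three dimensions and is not quoted from the paper; $\mathbf{r}(t)$ denotes the position of an oxygen atom at time $t$, and the angle brackets denote an average over atoms and time origins.

$$
D = \lim_{t \to \infty} \frac{1}{6t} \left\langle \left| \mathbf{r}(t) - \mathbf{r}(0) \right|^2 \right\rangle
$$

Under this assumption, the diffusion coefficient is one sixth of the long-time slope of the mean-square displacement curve.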