Machine Learning H-theorem

Ruben Lier

TL;DR

This work tackles the problem of extracting the thermodynamic arrow of time from chaotic many-particle dynamics by learning the $H$-functional underlying the Boltzmann equation. It introduces a permutation-invariant, DeepSets-based neural network that produces a time-increasing function $h(\mathbf{V}_t)$ approximating the $H$-functional $H(t)=\int d\mathbf{v}\, f(\mathbf{v},t)\log f(\mathbf{v},t)$ up to an affine transform. The network is trained with a siamese-like loss that enforces the nonincrease of $H$ over time (via the difference $h(\mathbf{V}_t)-h(\mathbf{V}_{t+1})$), together with a regularization term. The authors demonstrate that, after affine alignment, the learned function can resemble the true $H$-functional; stability and consistency are best for an intermediate hidden size and for a leaky-ReLU variant of the loss, which reduces late-time oscillations. The approach shows how structured neural architectures and loss design can reveal signals of irreversibility in microscopic chaotic data, and it points toward extensions to more complex collisional systems and active matter.

Abstract

The H-theorem provides a microscopic foundation of the Second Law of Thermodynamics and is therefore essential to establishing statistical physics; at the same time, it has been subject to controversy that in part persists to this day. To better understand the H-theorem and its relation to the arrow of time, we study the equilibration of randomly oriented and positioned hard disks with periodic boundary conditions. We train a model based on the DeepSets architecture, which imposes invariance under permutations of the particle labels, to capture the irreversibility of the H-functional.
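The permutation-invariant model and siamese-like loss summarized above can be illustrated with a minimal NumPy sketch. It is not the authors' implementation. The layer sizes, the tanh activation, the sum pooling, the names `init_params`, `h`, and `siamese_loss`, and in particular the form of the regularization term (a placeholder that pushes the output variance toward 1 to rule out a constant output) are all assumptions made for illustration. The sign convention, penalizing $h(\mathbf{V}_t)-h(\mathbf{V}_{t+1})>0$ so that $h$ is driven to increase in time, is inferred from the summary rather than taken from the paper's equations.

```python
import numpy as np

def init_params(n_hidden=8, dim=2, seed=0):
    """Random weights for a one-layer phi (per particle) and a linear rho (after pooling)."""
    rng = np.random.default_rng(seed)
    return {
        "W_phi": rng.normal(0.0, 1.0 / np.sqrt(dim), (dim, n_hidden)),
        "b_phi": np.zeros(n_hidden),
        "W_rho": rng.normal(0.0, 1.0 / np.sqrt(n_hidden), (n_hidden, 1)),
        "b_rho": np.zeros(1),
    }

def h(params, V):
    """DeepSets-style output h(V) = rho(sum_i phi(v_i)) for velocities V of shape (N, dim)."""
    feats = np.tanh(V @ params["W_phi"] + params["b_phi"])  # phi applied to every particle
    pooled = feats.sum(axis=0)  # sum pooling makes the result independent of particle order
    return float((pooled @ params["W_rho"] + params["b_rho"])[0])

def leaky_relu(x, slope=0.01):
    """Leaky ReLU: identity for x > 0, small linear slope otherwise."""
    return np.where(x > 0, x, slope * x)

def siamese_loss(params, frames, slope=0.01, lam=1.0):
    """Loss over consecutive frames of one run; frames has shape (n_frames, N, dim).

    The same model processes V_t and V_{t+1} (siamese setup). The ordering term
    penalizes h(V_t) - h(V_{t+1}) > 0. The regularizer is a hypothetical placeholder.
    """
    out = np.array([h(params, V) for V in frames])
    diffs = out[:-1] - out[1:]  # h(V_t) - h(V_{t+1})
    ordering = leaky_relu(diffs, slope).mean()
    reg = (out.var() - 1.0) ** 2  # assumed form, not taken from the paper
    return float(ordering + lam * reg)
```

Because every particle passes through the same `phi` and the features are summed, shuffling the rows of `V` leaves `h(params, V)` unchanged up to floating-point rounding. Setting `slope=0.0` recovers a plain ReLU version of the ordering term.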

Paper Structure

This paper contains 8 sections, 18 equations, 5 figures, 3 tables.

Figures (5)

  • Figure 1: Depiction of an elastic collision between two hard disks.
  • Figure 2: Number of collisions observed in a single run, averaged over 5 runs, versus the frame rate. The frame rate is the inverse of the simulation time step $dt$.
  • Figure 3: Theoretically predicted Maxwell-Boltzmann distribution together with the time-averaged and final-frame histograms of the hard-disk simulation, for the four runs that are also considered in Fig. \ref{figsact123123}. The histograms use a bin size of 0.005. The disks are initialized with a uniform velocity magnitude $v = 0.12$, which means that the Maxwell-Boltzmann distribution, given by Eq. \ref{eq:maxwellboltzmann}, has temperature ${k_\text{b}} T = 0.0076$ and mass $m=1$. The averaging starts after 20% of the run has been completed.
  • Figure 4: Schematic of how the model output at one time step is combined with that of the next time step to form the loss, in the manner of a siamese neural network: the inputs of the two time steps are processed by the same model, which has the form of \ref{model}. The schematic shows 3 input vectors $\mathbf{v}^{(i)}_t$ of a single run $n$, whose label is omitted to avoid clutter. The picture uses $n_{\text{hidden}} = 3$, whereas the value actually used is given in Eq. \ref{eq:hiddentrain}.
  • Figure 5: $H$-functional given by Eq. \ref{eq:Hfunctional} versus the affinely fitted model output. The $H$-functional is computed as a Riemann sum with the same bin size as in Fig. \ref{figsimpleimpact123}.
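The comparison described for Figure 5, an $H$-functional computed as a Riemann sum over a histogram and a model output fitted to it affinely, can be sketched as follows. This is an illustrative sketch under stated assumptions, not the paper's procedure. It assumes a two-dimensional histogram over $(v_x, v_y)$ with bin width 0.005 per axis, a hypothetical cutoff `v_max`, and an ordinary least-squares fit for the affine map. The function names `h_functional` and `affine_fit` are invented here.

```python
import numpy as np

def h_functional(V, bin_size=0.005, v_max=0.5):
    """Riemann-sum estimate of H = integral of f log f for velocities V of shape (N, 2).

    Assumes a square grid over [-v_max, v_max]^2; v_max is a hypothetical cutoff.
    """
    n_bins = int(round(2 * v_max / bin_size))
    edges = np.linspace(-v_max, v_max, n_bins + 1)
    counts, _, _ = np.histogram2d(V[:, 0], V[:, 1], bins=[edges, edges])
    cell = (edges[1] - edges[0]) ** 2  # area of one velocity-space cell
    f = counts / (counts.sum() * cell)  # normalized so that sum(f) * cell = 1
    nz = f > 0  # empty cells contribute nothing, since f log f -> 0 as f -> 0
    return float(np.sum(f[nz] * np.log(f[nz])) * cell)

def affine_fit(h_out, H_true):
    """Least-squares a, b such that a * h_out + b approximates H_true."""
    h_out = np.asarray(h_out, dtype=float)
    a, b = np.polyfit(h_out, np.asarray(H_true, dtype=float), 1)
    return a * h_out + b, a, b
```

As a sanity check on this sketch, velocities with a single speed and random directions give a larger `h_functional` than Gaussian-distributed velocities of comparable kinetic energy, matching the expectation that $H$ decreases as the system equilibrates. The fitted slope `a` may be negative, which is how an increasing model output can be aligned with a decreasing $H$.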