Thermodynamic Entropy as Information -- A compression-based demonstration of the Shannon-Boltzmann equivalence in condensed matter

Dallin Fisher, Qi-Jun Hong

TL;DR

The paper addresses computing the thermodynamic entropy of condensed matter directly from atomic configurations without explicit physical partitioning. It introduces ASDF, a compression-based approach that encodes DFT-MD microstates with a K-SVD–style dictionary using ternary coefficients to obtain $H_{\mathrm{coeff}}$, which maps to entropy via $S = \frac{N_A}{N} k_B \ln 2\, H_{\mathrm{coeff}}$, with kinetic contributions canceling in entropy differences. The method reproduces benchmark entropies for metals, semiconductors, oxides, and refractory ceramics in solid and liquid phases, achieving agreement within about $2\ \mathrm{J\,K^{-1}\,mol^{-1}}$, and reveals that information content fully underpins thermodynamic disorder. By tying the reconstruction threshold to the thermal de Broglie wavelength and demonstrating stability across system size and trajectory length, the work provides a parameter-free, general framework for entropy (and free energies) directly from atomic data, unifying information theory and statistical mechanics.
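As a rough illustration of the conversion quoted above, the sketch below turns a bit count into a molar entropy using $S = \frac{N_A}{N} k_B \ln 2\, H_{\mathrm{coeff}}$. The function name and the 64-atom, 640-bit input are hypothetical, chosen only to show the arithmetic; they are not values from the paper.

```python
import math

K_B = 1.380649e-23    # Boltzmann constant in J/K (exact SI value)
N_A = 6.02214076e23   # Avogadro constant in 1/mol (exact SI value)


def molar_entropy_from_bits(h_coeff_bits: float, n_atoms: int) -> float:
    """Return S in J K^-1 mol^-1 from S = (N_A / N) * k_B * ln(2) * H_coeff.

    h_coeff_bits: total bits needed to describe one microstate (H_coeff).
    n_atoms:      number of atoms N in the simulation cell.
    """
    if n_atoms <= 0:
        raise ValueError("n_atoms must be positive")
    return (N_A / n_atoms) * K_B * math.log(2) * h_coeff_bits


# Hypothetical input (not from the paper): 64 atoms, 640 bits per microstate.
s_example = molar_entropy_from_bits(640.0, 64)
print(f"S = {s_example:.2f} J/(K mol)")
```

One bit per atom corresponds to $R \ln 2 \approx 5.76\ \mathrm{J\,K^{-1}\,mol^{-1}}$, so the quoted agreement of about $2\ \mathrm{J\,K^{-1}\,mol^{-1}}$ amounts to roughly a third of a bit per atom.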

Abstract

We demonstrate that Shannon's information entropy and the thermodynamic entropy of Boltzmann and Gibbs are quantitatively equivalent for real condensed-matter systems. By interpreting atomic configurations as information sources, we compute entropy directly from the compressibility of molecular-dynamics trajectories, without physical partitioning or empirical modeling. A custom lossy-compression algorithm measures the minimum number of bits required to describe a microstate at finite precision, and this bit count maps exactly to thermodynamic entropy through the Shannon-Boltzmann relation. The method reproduces benchmark entropies for metals, semiconductors, oxides, and refractory ceramics in both solid and liquid phases, establishing information as the fundamental quantity underlying thermodynamic disorder. This equivalence unifies information theory and statistical mechanics, providing a general and computationally efficient framework for determining entropies and free energies directly from atomic data.
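To make "number of bits required to describe a microstate" concrete, here is a minimal sketch that estimates the bit count of a symbol stream from its empirical symbol frequencies (a zeroth-order Shannon estimate). It assumes the compressed microstate is available as a flat sequence of ternary coefficients in {-1, 0, +1}; the paper's actual encoder is not described in this summary and may differ, so treat the function and the sample data as illustrative only.

```python
import math
from collections import Counter


def zeroth_order_bits(symbols):
    """Estimate total bits as n * H, with H = -sum(p * log2(p)) over symbols.

    This ignores correlations between symbols, so it only approximates the
    bit count a real encoder would achieve.
    """
    n = len(symbols)
    if n == 0:
        return 0.0
    counts = Counter(symbols)
    h_per_symbol = -sum((c / n) * math.log2(c / n) for c in counts.values())
    return n * h_per_symbol


# Hypothetical ternary coefficient stream (not data from the paper).
coeffs = [0, 0, 1, -1]
bits = zeroth_order_bits(coeffs)
print(f"{bits:.1f} bits for {len(coeffs)} coefficients")
```

A sparser stream (more zeros) has a more skewed symbol distribution and therefore a smaller bit count, which is the sense in which a more ordered configuration compresses better.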

Paper Structure

This paper contains 5 sections, 39 equations, 3 figures.

Figures (3)

  • Figure 1: High-level overview of ASDF. MD simulations are performed and the atomistic trajectory is reduced to the essential data describing each microstate. This data file, consisting of $N$ atomic positions, is then compressed using a K-SVD–style sparse-dictionary algorithm, producing a compact representation that is converted into a bit string. The length of this bit string corresponds directly to the thermodynamic entropy of the microstate.
  • Figure 2: Threshold-error behavior, stability, and trajectory sensitivity of the ASDF entropy method. (a) Absolute entropy of silicon as a function of $\log(1/\varepsilon)$. In information theory, artificially tightening the threshold error $\varepsilon$ (that is, increasing $\log(1/\varepsilon)$) inflates the inferred information content in proportion to the system's "information dimensionality." Since the thermal de Broglie wavelength provides an approximate scale at which all physically relevant information is captured, the entropy exhibits a linear regime for $\varepsilon < \lambda_{\mathrm{dB}}$ with an effective dimensionality close to $3R$, nearly identical for solid and liquid silicon. Thus, the entropy value nearest the de Broglie wavelength corresponds most closely to classical benchmarks. (b) Same analysis for aluminum, exhibiting analogous behavior. (c) Stability of the absolute entropy of solid and liquid aluminum over an extended MD trajectory. For the ASDF method to be well-defined, the entropy must be insensitive to trajectory length and to the specific segment sampled after an initial relaxation period; the figure confirms this stability. (d) Sensitivity of the algorithm to rare or atypical configurations. During a section of one trajectory, a stacking-fault dislocation temporarily increases the measured entropy. Excluding such rare events yields a clean and consistent bulk entropy value. Together, panels (a–d) demonstrate that the ASDF method is stable with respect to threshold error, trajectory length, and sampling window, while remaining sensitive enough to identify anomalous microstructural events.
  • Figure 3: System-size convergence, threshold-error independence, and agreement with benchmark entropy calculations. (a) Stability of the entropy difference $\Delta S$ for silicon and aluminum as a function of $\log(1/\varepsilon)$. The plateau near the thermal de Broglie wavelength confirms that $\Delta S$ becomes independent of the threshold error once all physically relevant information is resolved. (b) Absolute entropy of solid and liquid aluminum as a function of system size $N$, shown for both ASDF and the benchmark mds method. A minimum system size is required to capture bulk configurational complexity; beyond this, the entropy saturates and remains stable with increasing $N$. Both approaches exhibit a slight upward trend at large $N$, reflecting the increased configurational space sampled in larger systems. (c) Comparison of ASDF and mds entropies for Si, W, Al, and Ti using a tight threshold $\varepsilon = 10^{-4}$ Å. This value minimizes threshold-induced error while remaining above the numerical noise floor of MD trajectories. All materials agree with mds to within $2\ \mathrm{J\,K^{-1}\,mol^{-1}}$, demonstrating excellent precision. (d) Absolute solid and liquid entropies for the same four materials using $\varepsilon = \lambda_{\mathrm{dB}}$. ASDF combines coefficient entropy with the electronic entropy, whereas mds sums vibrational, electronic, and configurational contributions. All materials show close agreement between the two methods. Together, panels (a–d) demonstrate independence from threshold error, convergence with system size, and excellent agreement of both $\Delta S$ and absolute $S$ with benchmark calculations.
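
The threshold behavior described in the captions for Figures 2 and 3 can be illustrated numerically. The sketch below evaluates the standard thermal de Broglie wavelength $\lambda_{\mathrm{dB}} = h / \sqrt{2\pi m k_B T}$ and the entropy offset implied by a linear regime of slope $3R$ in $\ln(1/\varepsilon)$. Several things here are assumptions rather than details from the paper: the natural logarithm as the base of $\log$, the choice of silicon at 1687 K (its melting point), and both function names.

```python
import math

H_PLANCK = 6.62607015e-34   # Planck constant, J s (exact SI value)
K_B = 1.380649e-23          # Boltzmann constant, J/K (exact SI value)
N_A = 6.02214076e23         # Avogadro constant, 1/mol (exact SI value)
AMU = 1.66053906660e-27     # atomic mass constant, kg
R_GAS = K_B * N_A           # molar gas constant, J/(K mol)


def thermal_de_broglie_angstrom(mass_amu: float, temperature_k: float) -> float:
    """Thermal de Broglie wavelength h / sqrt(2 pi m k_B T), in angstrom."""
    mass_kg = mass_amu * AMU
    lam_m = H_PLANCK / math.sqrt(2.0 * math.pi * mass_kg * K_B * temperature_k)
    return lam_m * 1e10


def threshold_shift(eps_from: float, eps_to: float, dim: float = 3.0) -> float:
    """Entropy change (J/(K mol)) when the threshold moves from eps_from to
    eps_to, assuming S grows as dim * R * ln(1/eps) in the linear regime."""
    return dim * R_GAS * math.log(eps_from / eps_to)


# Assumed illustration: silicon (28.0855 u) at 1687 K.
lam_si = thermal_de_broglie_angstrom(28.0855, 1687.0)
# Offset between evaluating at lambda_dB and at the tight threshold 1e-4 angstrom.
shift = threshold_shift(lam_si, 1e-4)
print(f"lambda_dB(Si) = {lam_si:.3f} angstrom, offset = {shift:.1f} J/(K mol)")
```

Under this assumed linear form, solid and liquid phases with the same slope pick up the same offset, so it cancels in $\Delta S$. That is consistent with the plateau in $\Delta S$ described for Figure 3(a).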