
Typicality of thermal states in isolated quantum systems corresponds to ubiquity of global minima in deep artificial neural networks

Takaaki Monnai

TL;DR

The paper establishes a qualitative correspondence between thermalization in isolated quantum systems and overparameterized neural networks in the Neural Tangent Kernel (NTK) regime. By focusing on a restricted set of observables and exploiting a Wishart-type matrix structure, it shows that the ubiquity of global minima in the NTK regime mirrors the typicality of pure thermal states, with reduced-density-matrix statistics matching microcanonical predictions when subsystems are small. It also ties the system-size dependence of entanglement and state distinguishability to the double descent phenomenon through the NTK's eigenstructure and a fitting threshold $P=\kappa n_L N$, illustrating how overparameterization yields degenerate function outputs and robustness in learning. The work provides a framework for connecting typicality, global minima, entanglement, and double descent across quantum statistical mechanics and deep learning, as a step toward a unified theoretical perspective.
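The claim that global minima sit near any random initialization can be illustrated with a toy linearized model. The sketch below is not the paper's construction: it assumes a model that is exactly linear in its parameters, with a hypothetical Jacobian `J` whose entries are i.i.d. Gaussian, so that the Gram matrix `J @ J.T` is Wishart-type. The sizes `N` and `P` and the helper name `nearest_global_minimum` are illustrative choices.

```python
import numpy as np

def nearest_global_minimum(J, residual):
    """Minimum-norm parameter shift d that solves J @ d = residual."""
    gram = J @ J.T  # N x N Wishart-type matrix (assumption: Gaussian J)
    return J.T @ np.linalg.solve(gram, residual)

rng = np.random.default_rng(0)
N, P = 20, 2000                                # N training outputs, P parameters
J = rng.standard_normal((N, P)) / np.sqrt(P)   # toy Jacobian, assumed Gaussian
theta0 = rng.standard_normal(P)                # random initialization
y = rng.standard_normal(N)                     # arbitrary training targets

f0 = J @ theta0                                # linearized model output at init
delta = nearest_global_minimum(J, y - f0)
theta_star = theta0 + delta                    # interpolating parameters

train_loss = 0.5 * np.sum((J @ theta_star - y) ** 2)
relative_shift = np.linalg.norm(delta) / np.linalg.norm(theta0)
print(train_loss, relative_shift)
```

Because `P` is much larger than `N`, the Gram matrix is well conditioned and the interpolating solution lies at a small relative distance from `theta0`, whichever initialization is drawn. This is the sense in which zero-loss points are ubiquitous in this toy setting.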

Abstract

Neural Tangent Kernel (NTK) theory guarantees the existence of a global minimum of the cost function in the neighborhood of an arbitrary random initialization of a deep artificial neural network. In this paper, we show that the ubiquity of global minima corresponds directly to the typicality of pure thermal states in isolated quantum systems, by identifying a common underlying mechanism that involves a small number of observables and a Wishart-type matrix. Moreover, we demonstrate that the increase in distinguishability of the reduced density matrices of typical pure states with subsystem size corresponds to the double descent phenomenon observed when the layer width of finite-width artificial neural networks is varied.
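The quantum side of the correspondence can be checked numerically in a simplified setting. The sketch below assumes Haar-random pure states on the full Hilbert space of `n_total` qubits rather than an energy shell, so the reference state is the maximally mixed state. It uses the standard fact that the reduced density matrix of such a state is a normalized complex Wishart matrix. The function names and system sizes are illustrative, not taken from the paper.

```python
import numpy as np

def reduced_density_matrix(d_a, d_b, rng):
    """Reduced state on A of a Haar-random pure state on A x B."""
    g = rng.standard_normal((d_a, d_b)) + 1j * rng.standard_normal((d_a, d_b))
    w = g @ g.conj().T             # d_a x d_a complex Wishart matrix
    return w / np.trace(w).real    # normalize to unit trace

def trace_distance_to_max_mixed(rho):
    """Trace distance between rho and the maximally mixed state."""
    d = rho.shape[0]
    eigs = np.linalg.eigvalsh(rho - np.eye(d) / d)
    return 0.5 * np.sum(np.abs(eigs))

rng = np.random.default_rng(1)
n_total = 12                       # total number of qubits (illustrative)
distances = {}
for n_a in (1, 2, 4, 6):           # subsystem sizes in qubits
    d_a, d_b = 2 ** n_a, 2 ** (n_total - n_a)
    samples = [
        trace_distance_to_max_mixed(reduced_density_matrix(d_a, d_b, rng))
        for _ in range(20)
    ]
    distances[n_a] = float(np.mean(samples))
    print(n_a, distances[n_a])
```

For small subsystems the averaged distance is close to zero, meaning typical pure states are locally indistinguishable from the reference ensemble. The distance grows as the subsystem approaches half of the total system.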

Paper Structure

This paper contains 4 sections, 4 equations, 1 table.