A spectral approach to Hebbian-like neural networks

Elena Agliari, Domenico Luongo, Alberto Fachechi

TL;DR

The paper investigates spectral properties of dreaming Hopfield networks built from corrupted exemplars, comparing supervised and unsupervised training; it derives exact limiting eigenvalue distributions in the thermodynamic limit and shows how dreaming time $t$ morphs the spectrum from a Marchenko-Pastur mixture toward a projector-like form, with direct implications for retrieval. Through a Gaussian approximation, it provides explicit expressions for 1-step retrieval metrics (Mattis magnetization) and attractiveness, linking them to the spectrum via integrals over the limiting distributions. The results reveal that dreaming enhances retrieval in basic storing and supervised settings but can impair generalization in unsupervised learning when data quality or quantity is insufficient, offering a spectral lens on retrieval and learning dynamics in Hebbian-like networks under data corruption and consolidation dynamics.
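The spectral picture described above can be explored numerically for the basic storing setting. The sketch below is not taken from the paper: it assumes the dreaming kernel commonly used in the dreaming Hopfield literature, $\mathbf{J}(t) = \frac{1}{N}\,\boldsymbol{\xi}^{T}(1+t)(\mathbf{I}+t\mathbf{C})^{-1}\boldsymbol{\xi}$ with $\mathbf{C}=\boldsymbol{\xi}\boldsymbol{\xi}^{T}/N$, and the function name `dreaming_couplings` and all parameter values are illustrative. Under that assumption, $t=0$ gives Hebb's rule (a mass of zero eigenvalues plus a Marchenko-Pastur bulk), while large $t$ pushes every nonzero eigenvalue toward 1, i.e. a projector.

```python
import numpy as np

def dreaming_couplings(xi, t):
    """Couplings for patterns xi of shape (K, N) at dreaming time t.

    Assumed kernel (not stated in this summary): (1 + t) * (I + t C)^{-1},
    with C the K x K pattern correlation matrix. t = 0 gives Hebb's rule.
    """
    K, N = xi.shape
    C = xi @ xi.T / N
    kernel = (1.0 + t) * np.linalg.inv(np.eye(K) + t * C)
    return xi.T @ kernel @ xi / N          # (N, N) symmetric matrix

rng = np.random.default_rng(0)
N, alpha = 400, 0.25                       # illustrative size and load
K = int(alpha * N)
xi = rng.choice([-1.0, 1.0], size=(K, N))  # random binary patterns

for t in (0.0, 1.0, 1e6):
    lam = np.linalg.eigvalsh(dreaming_couplings(xi, t))
    bulk = lam[lam > 1e-8]                 # eigenvalues outside the zero peak
    print(f"t={t:g}: zero modes={np.sum(lam <= 1e-8)}, "
          f"bulk in [{bulk.min():.3f}, {bulk.max():.3f}]")
```

At $t=0$ the bulk should sit close to the Marchenko-Pastur support $[(1-\sqrt{\alpha})^2, (1+\sqrt{\alpha})^2]$, up to finite-size fluctuations, and the number of zero modes equals $N-K$ because the matrix has rank at most $K$.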

Abstract

We consider the Hopfield neural network as a model of associative memory and we define its neuronal interaction matrix $\mathbf{J}$ as a function of a set of $K \times M$ binary vectors $\{\boldsymbol{\xi}^{\mu, A}\}_{\mu=1,...,K}^{A=1,...,M}$ representing a sample of the reality that we want to retrieve. In particular, any item $\boldsymbol{\xi}^{\mu, A}$ is meant as a corrupted version of an unknown ground pattern $\boldsymbol{\zeta}^{\mu}$, which is the target of our retrieval process. We consider and compare two definitions for $\mathbf{J}$, referred to as supervised and unsupervised, according to whether the class $\mu$ that each example belongs to is unveiled or not; these definitions recover the paradigmatic Hebb's rule under suitable limits. The spectral properties of the resulting matrices are studied and used to inspect the retrieval capabilities of the related models as a function of their control parameters.
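To make the two constructions concrete, here is a minimal sketch of how corrupted examples and the corresponding Hebbian-like couplings might be built. Everything specific in it is an assumption, since the abstract does not give the formulas: each bit of an example is taken to agree with the ground pattern with probability $(1+r)/2$, the unsupervised matrix sums outer products over all $K \times M$ examples, the supervised matrix uses the per-class averages, and the normalizations are chosen only so that both reduce to Hebb's rule when the examples are noiseless. The function names are hypothetical.

```python
import numpy as np

def make_examples(zeta, M, r, rng):
    """Corrupt each ground pattern zeta[mu] into M examples.

    Assumption: every bit is kept with probability (1 + r) / 2 and
    flipped otherwise, independently; r = 1 means no corruption.
    """
    K, N = zeta.shape
    keep = rng.random((K, M, N)) < (1.0 + r) / 2.0
    chi = np.where(keep, 1.0, -1.0)
    return chi * zeta[:, None, :]          # shape (K, M, N)

def unsupervised_couplings(xi):
    """Hebbian sum over all K*M examples, class labels ignored."""
    K, M, N = xi.shape
    flat = xi.reshape(K * M, N)
    return flat.T @ flat / (N * M)

def supervised_couplings(xi):
    """Hebbian sum over the K empirical class averages (labels known)."""
    K, M, N = xi.shape
    mean = xi.mean(axis=1)                 # (K, N) average example per class
    return mean.T @ mean / N

rng = np.random.default_rng(0)
K, M, N, r = 5, 20, 200, 0.6               # illustrative values
zeta = rng.choice([-1.0, 1.0], size=(K, N))
xi = make_examples(zeta, M, r, rng)
J_unsup = unsupervised_couplings(xi)
J_sup = supervised_couplings(xi)
print(J_unsup.shape, J_sup.shape)
```

With `r = 1.0` every example coincides with its ground pattern, and both functions return the Hebbian matrix `zeta.T @ zeta / N`, which is one way to read the "suitable limits" mentioned in the abstract.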

Paper Structure (11 sections, 5 theorems, 34 equations, 5 figures)

Key Result

Lemma 1

The following results hold:

Figures (5)

  • Figure 1: Limiting spectral distributions of the couplings matrix. The figure shows the probability distribution $P(\lambda)= \frac{d \mu}{d \lambda}$ (Eq. \ref{eq:pdfs}) in the three settings under consideration: basic storing (first row), supervised (second row) and unsupervised (third row). In the first row, we plot the spectral distribution for various values of $\alpha$ and $t$, while in the supervised and unsupervised settings we fix $\alpha=0.1$ and vary $t$ and $r$. The vertical arrows (whose heights are arbitrary) mark the location of the $\delta$-peak: in the basic storing and supervised cases it sits at $\lambda=0$ (as $\lambda_{peak} =0$), while in the unsupervised setting it depends on $\alpha$, $t$ and $r$, as foreseen by Thm. \ref{thm:1}.
  • Figure 2: Squared error for supervised and unsupervised settings. The figure compares numerical results for the SE (Eq. \ref{eq:exdeltaM}) at finite $M$ with the theoretical prediction for $M\to\infty$ in the thermodynamic limit, as a function of $r$ for various values of $\alpha$ and $t$. The first row refers to the supervised setting, while the second row shows the results for the unsupervised case. For fixed $t$, each plot shows the results for $\alpha=0.1$ (solid black curve), $\alpha=0.2$ (dashed black curve) and $\alpha=0.3$ (dotted black curve), while the markers refer to $M=50,100,200$. The network size is fixed to $N=1000$ in all cases.
  • Figure 3: Stability and attractiveness of patterns in the basic storing setting. The figure compares the theoretical predictions of stability (upper left plot) and attractiveness (other plots) with numerical simulations. In the former case, the 1-step magnetization $m_1$ starting from one of the patterns ($p=1$) is given as a function of $\alpha$, while for the attractiveness we fix $\alpha=0.1,0.2,0.3$ and $t=0,10$ (the Hebbian and large dreaming time limits, respectively) and consider $m_1$ as a function of the noise level $p$ of the starting configuration, as explained in Rem. \ref{rem:noisyinit}. In these plots, the dashed line is the identity function $m_1(p)=m_0(p)=p$. In the numerical simulations, we averaged over 100 different realizations of the patterns for systems with fixed size $N=5000$. In these plots, $m_1$ stands for $m^{(1)}$.
  • Figure 4: Attractiveness of ground truths in the supervised and unsupervised settings. The plot compares the theoretical predictions (given by Prop. \ref{prop:un_sup}) of the attractiveness of the ground truths with the numerical results for the supervised (first row) and unsupervised (second row) settings. Numerical results are averaged over 100 different realizations of $M=1000$ examples by varying $\alpha$ and $r$, and 100 different realizations of the initial conditions. In this case, initial conditions are testing examples, i.e. examples with the same statistics as the training points, but which are not stored as fixed points. The system size is fixed to $N=1000$. In the plots, $m_1$ stands for $m^{(1)}$.
  • Figure 5: Schematic representation of attractors in the basic storing setting. The figure shows a pictorial representation of fixed points in the Hopfield model (left) and dreaming model (right) at large dreaming time $t\gg1$. For large $\alpha$ (above the critical storage capacity $\alpha_c\approx 0.14$ for the Hopfield model), fixed points are the balls centered on the patterns with Hamming radius $R(p^*) =\frac{N}{2}(1-p^*)$, while in the dreaming model (for large but low enough $t$) patterns are fixed points. At $t\to\infty$, in the dreaming model at $\alpha\le 1$ patterns are always stable configurations for the neural dynamics.
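
The quantities named in the captions of Figures 3 and 5 can be illustrated with a short sketch. It is a hedged reading rather than the paper's procedure: the 1-step magnetization is taken to be the overlap with the target after one parallel update $\sigma \to \mathrm{sign}(\mathbf{J}\sigma)$, the noisy start is drawn by flipping bits independently so that its expected overlap is $p$, the couplings are plain Hebbian with the diagonal kept, and the helper names are hypothetical. The relation between overlap and Hamming distance, $d = \frac{N}{2}(1-m)$, is exact for $\pm 1$ vectors and matches the radius $R(p^*)$ quoted in Figure 5.

```python
import numpy as np

def noisy_copy(pattern, p, rng):
    """Flip bits independently so the expected overlap with `pattern` is p."""
    keep = rng.random(pattern.shape) < (1.0 + p) / 2.0
    return np.where(keep, pattern, -pattern)

def one_step_magnetization(J, pattern, sigma):
    """Overlap with `pattern` after one parallel update sigma -> sign(J sigma)."""
    updated = np.where(J @ sigma >= 0, 1.0, -1.0)
    return float(pattern @ updated) / pattern.size

rng = np.random.default_rng(1)
N, K = 1000, 50                            # illustrative size, load alpha = 0.05
xi = rng.choice([-1.0, 1.0], size=(K, N))
J = xi.T @ xi / N                          # Hebbian couplings (t = 0), diagonal kept

for p in (0.2, 0.5, 1.0):
    sigma = noisy_copy(xi[0], p, rng)
    m0 = float(xi[0] @ sigma) / N          # realized initial overlap
    hamming = int(np.sum(sigma != xi[0]))  # equals N * (1 - m0) / 2
    m1 = one_step_magnetization(J, xi[0], sigma)
    print(f"p={p}: m0={m0:.3f}, Hamming distance={hamming}, m1={m1:.3f}")
```

Points with $m_1 > m_0$ lie above the identity line drawn in Figure 3, meaning that a single update moves the configuration closer to the pattern.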

Theorems & Definitions (22)

  • Remark 1
  • Definition 1: Thermodynamic limit
  • Lemma 1
  • Remark 2
  • Theorem 1
  • Remark 3
  • Definition 2
  • Proposition 1
  • Definition 3
  • Definition 4
  • ...and 12 more