Generative modeling through internal high-dimensional chaotic activity

Samantha J. Fournier; Pierfrancesco Urbani

TL;DR

Generative modeling is often framed as sampling from a Boltzmann distribution with energy $E$; training then requires estimating gradients of the log-likelihood ($LL$), which is hampered by slow MCMC mixing. The authors show that chaotic dynamics in high-dimensional recurrent networks, trained with contrastive Hebbian updates that learn a symmetric perturbation $A$ and biases $b$, can act as an autonomous source of generative noise. On MNIST and Fashion-MNIST, the restricted 2- and 3-layer models produce plausible samples and score competitively on four accuracy indices, $\mathcal{E}^{(2)}$, $\mathcal{E}^{(s)}$, $\mathcal{E}^{(R)}$, and $\mathcal{E}^{(AAI)}$, with training conducted entirely without external noise. The work suggests a path toward biologically inspired generative models and motivates theoretical analysis with dynamical mean-field theory.

Abstract

Generative modeling aims at producing new datapoints whose statistical properties resemble those of a training dataset. In recent years, there has been a burst of machine learning techniques and settings that achieve this goal with remarkable performance. In most of these settings, the training dataset is used in conjunction with noise, which is added as a source of statistical variability and is essential for the generative task. Here, we explore the idea of using the internal dynamics of high-dimensional chaotic systems as a way to generate new datapoints from a training dataset. We show that simple learning rules can achieve this goal within a set of vanilla architectures, and we characterize the quality of the generated datapoints through standard accuracy measures.
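To make the idea of internally generated variability concrete, here is a minimal sketch of a standard random rate network, $\tau\,\dot{x} = -x + g\,J\tanh(x)$ with i.i.d. Gaussian couplings of variance $1/N$. This textbook form is an assumption made for illustration and is not taken from the paper, whose model definitions are not reproduced on this page. Only the values $g=1.5$, $dt=1$ and $\tau=10$ are borrowed from the Figure 2 caption; the function names `simulate` and `rms`, the network size and the run length are hypothetical choices.

```python
import numpy as np


def simulate(g, N=500, T=1000, dt=1.0, tau=10.0, seed=0, x0=None):
    """Euler-integrate tau * dx/dt = -x + g * J @ tanh(x); return the final state.

    Assumed illustrative model, not the architecture used in the paper.
    """
    rng = np.random.default_rng(seed)
    # Random couplings with variance 1/N; J is drawn first so that it is
    # identical for every call that uses the same seed.
    J = rng.normal(0.0, 1.0 / np.sqrt(N), size=(N, N))
    x = rng.normal(size=N) if x0 is None else np.array(x0, dtype=float)
    for _ in range(T):
        x = x + (dt / tau) * (-x + g * J @ np.tanh(x))
    return x


def rms(x):
    """Root-mean-square activity of a state vector."""
    return float(np.sqrt(np.mean(x ** 2)))


# For g < 1 the activity decays to the zero fixed point; for g > 1 that
# fixed point is unstable and the network sustains its own activity.
x_quiet = simulate(g=0.5)
x_active = simulate(g=1.5)
print("rms activity, g=0.5:", rms(x_quiet))
print("rms activity, g=1.5:", rms(x_active))

# Two runs on the same couplings whose initial conditions differ by a tiny
# perturbation; in a chaotic regime the final states are expected to differ
# by far more than the initial offset.
rng = np.random.default_rng(1)
start = rng.normal(size=500)
a = simulate(g=1.5, x0=start)
b = simulate(g=1.5, x0=start + 1e-6 * rng.normal(size=500))
print("final separation, g=1.5:", float(np.linalg.norm(a - b)))
```

No noise term appears anywhere in the update: once the couplings and the initial state are drawn, the trajectory is deterministic, and any variability across runs comes from the dynamics themselves. How such dynamics are coupled to data through the learned $A$ and $b$ is specified in the paper's model definitions, not in this sketch.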

Paper Structure (6 sections, 14 equations, 5 figures)

Introduction
Definition of the models
Results
Conclusions
Definition of the deep restricted architecture
Definition of the accuracy indices

Figures (5)

Figure 1: The three different architectures considered in this work: $a)$ Unrestricted architecture, $b)$ Restricted 2-layer architecture and $c)$ Restricted 3-layer architecture. The fields $\bm{b}$, $\bm{c}$ and $\bm{d}$ are not represented. Panel $d)$: Pipeline of training protocol.
Figure 2: Data samples and generated samples from the trained models. Parameters: $N_v=784$ (which is the dimension of each data sample), $N_h=N^{(1)}_h=500$, $N^{(2)}_h=100$, $dt=1$, $\tau=10$, $T=100$, $g=1.5$, $k=0.01$, $M=500$, $N_s=10\,000$. Training lasted $300\,000$ training steps in all cases except for the Deep model trained on FashionMNIST where training lasted $400\,000$ training steps.
Figure 3: Panel $a)$ Samples generated at different times $t$ from the 2-layer Restricted model trained on MNIST (top half) and FashionMNIST (bottom half) with the same parameters as in Fig. 2. Panel $b)$ Performance at different times $t$ of the same model trained on MNIST (blue dots) and FashionMNIST (orange dots) across $4$ accuracy indices (see the appendix "Definition of the accuracy indices" for their definitions). The lower these indices are, the better the generated samples. All the accuracy indices are computed on a testing dataset composed of $N_s=10\,000$ data samples.
Figure 4: Receptive fields of $9$ randomly chosen hidden neurons in the 2-layer Restricted version trained on MNIST. Note that the overall color scale has been divided by $g$.
Figure 5: Performance of the different models trained on MNIST (top half) and FashionMNIST (bottom half) during training ($t_\mathrm{age}$ is the number of training steps). The accuracy indicators are computed at the target time $T^*=T$.
