
Inferring effective couplings with Restricted Boltzmann Machines

Aurélien Decelle, Cyril Furtlehner, Alfonso De Jesus Navas Gómez, Beatriz Seoane

TL;DR

This work establishes a principled mapping from Restricted Boltzmann Machines (RBMs) to a generalized Ising model (GIM) with higher-order spin interactions, so that the learned representations can be read directly as multi-body couplings. The authors derive explicit formulas for the $n$-body couplings $J^{(n)}$ and the local fields $H_j$ in terms of the RBM parameters, employing a Gaussian approximation for tractability, and validate the approach through controlled inverse experiments on 1D/2D Ising systems and disordered networks. Compared with prior mappings, the proposed method infers interaction networks, including higher-order terms, with better accuracy and stability, and the authors release public code for practical use. The study also investigates training dynamics, highlighting how equilibrium versus out-of-equilibrium training affects the inferred models, and outlines limitations and future extensions to non-binary data modalities.

Abstract

Generative models offer a direct way of modeling complex data. Energy-based models attempt to encode the statistical correlations observed in the data at the level of the Boltzmann weight associated with an energy function in the form of a neural network. We address here the challenge of understanding the physical interpretation of such models. In this study, we propose a simple solution by implementing a direct mapping between the Restricted Boltzmann Machine and an effective Ising spin Hamiltonian. This mapping includes interactions of all possible orders, going beyond the conventional pairwise interactions typically considered in the inverse Ising (or Boltzmann Machine) approach, and allowing the description of complex datasets. Earlier works attempted to achieve this goal, but the proposed mappings were inaccurate for inference applications, did not properly treat the complexity of the problem, or did not provide precise prescriptions for practical application. To validate our method, we performed several controlled inverse numerical experiments in which we trained the RBMs using equilibrium samples of predefined models with local external fields, 2-body and 3-body interactions in different sparse topologies. The results demonstrate the effectiveness of our proposed approach in learning the correct interaction network and pave the way for its application in modeling interesting binary variable datasets. We also evaluate the quality of the inferred model based on different training methods.
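
To make the idea of an effective Hamiltonian with interactions of all orders concrete, here is a minimal brute-force sketch. It is not the authors' method (which relies on a Gaussian approximation to remain tractable) and is only feasible for a handful of spins. It assumes $\pm 1$ visible and $\pm 1$ hidden units and the sign convention $-\mathcal{H}(\sigma) = \sum_i H_i \sigma_i + \sum_{i<j} J_{ij} \sigma_i \sigma_j + \dots$; the names `effective_energy` and `coupling` are hypothetical. Summing out the hidden units gives the effective energy exactly, and each coupling is then the projection of $-\mathcal{H}$ onto the corresponding product of spins, averaged uniformly over all $2^N$ configurations.

```python
import itertools
import numpy as np

def effective_energy(sigma, W, b, c):
    """Visible-spin energy after summing out +-1 hidden units exactly.

    Assumed shapes: sigma (..., N) with entries +-1, W (N, Nh), b (N,), c (Nh,).
    """
    field_on_hidden = c + sigma @ W
    return -(sigma @ b) - np.sum(np.log(2.0 * np.cosh(field_on_hidden)), axis=-1)

def coupling(indices, W, b, c):
    """Coupling among the spins in `indices`, by enumerating all 2^N states.

    One index gives a local field, two a pairwise coupling, three a 3-body
    coupling, and so on. The cost grows as 2^N, so this is for small N only.
    """
    n_visible = W.shape[0]
    states = np.array(list(itertools.product([-1, 1], repeat=n_visible)))
    minus_H = -effective_energy(states, W, b, c)
    monomial = np.prod(states[:, list(indices)], axis=1)
    # Products of spins are orthonormal under the uniform average over states,
    # so this mean isolates the coefficient of the chosen monomial.
    return np.mean(monomial * minus_H)
```

For a single hidden unit with zero bias coupled to two spins with weights $w_1$ and $w_2$, this reproduces the closed form $J_{12} = \tfrac{1}{2}\left[\ln\cosh(w_1 + w_2) - \ln\cosh(w_1 - w_2)\right]$, which is a convenient sanity check.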

Paper Structure

This paper contains 27 sections, 48 equations, 17 figures, 1 table.

Figures (17)

  • Figure 1: Limitations of the Boltzmann Machine. We compare the inferred pairwise coupling matrix and the samples generated by the Boltzmann Machine (BM) and the Restricted Boltzmann Machine (RBM) when trained with maximum likelihood on the MNIST dataset \cite{lecun1998gradient}. The generated samples are obtained by iterating 84 independent Markov Chain Monte Carlo (MCMC) simulations from random initialization until convergence. A1 illustrates the architecture of the BM, A2 shows the coupling matrix $J$ learned at the end of training, and A3 shows 84 equilibrium samples of the same model. B1 shows the RBM architecture, B2 the effective pairwise interactions obtained by mapping the RBM into a generalized Ising model (using Eq. \ref{2b_formula}), and B3 84 RBM equilibrium samples. C shows 40 examples from MNIST.
  • Figure 2: Pipeline of numerical analysis. We show a sketch of the numerical inverse Ising procedure we use to test our mapping between the RBM and a generalized Ising model (GIM) defined in Eq. \ref{generalized_ising_hamiltonian-intro}. First, we generate equilibrium samples with a predefined GIM. Then, we train an RBM with these samples and use the RBM parameters to infer the effective fields and couplings between spins. Finally, we compare the derived couplings to the true couplings used to create the dataset. As an example, we show the comparison of the inferred and the original pairwise coupling matrices in the triangles below and above the diagonal, respectively, in three different inverse Ising experiments where the configurations in the training set were generated at $\beta=0.2$ with the (a) 1D ferromagnetic Ising model, (b) 2D Ising model and (c) a disordered 2D Ising model (the Edwards-Anderson model) containing both positive and negative interactions.
  • Figure 3: RBM effective model for the ferromagnetic 1D Ising model ($L=50, \ \beta=0.2$). (a) Fields, (b) pairwise couplings, and (c) the error of the pairwise couplings inferred by the RBM as a function of training time $t$, measured in parameter updates. In (a) and (b), the solid lines represent the mean of the derived parameters, while the width of the shaded area indicates their standard deviation. The inset in (c) shows the inference error as a function of the inverse square root of the training dataset size $M$ at $t=10^6$. (d) Histogram of the inferred fields, pairwise and 3-body couplings, and (e) inferred 2-body coupling matrices at the end of training ($t=10^6$). In (e), the RBM-inferred couplings are given in the lower part of the matrices, while the upper part contains the couplings of the ground-truth model. The RBMs shown were trained using the PCD-50 scheme, with $N_\mathrm{h} = 100$ and $\gamma = 0.01$.
  • Figure 4: We compare the histogram of the pairwise couplings inferred by the RBM with those obtained using Belief Propagation (exact in the 1D Ising model in the limit of $M,N\to\infty$) and with an Inverse Ising model (a Boltzmann Machine). The results show remarkable agreement in both the inferred temperature and the width of the peaks. The dataset size in these examples was set to $M=10^4$. The RBMs shown were trained using the PCD-50 scheme, with $N_\mathrm{h} = 100$. The learning rate was $\gamma = 0.01$ for both the RBM and the BM.
  • Figure 5: For data generated with the ferromagnetic 1D Ising model without external field at two different temperatures ($\beta=1/T=0.4$ on the left and $\beta=0.8$ on the right), we train RBMs with three different sizes of the training set ($M=10^3, 10^4,$ and $10^5$). Colors indicate the value of $M$. Top: the log-likelihood obtained with AIS \cite{krause2020algorithms} as a function of training time $t$. Solid lines show the log-likelihood of the training set given the model, and dashed lines show the log-likelihood of the test set. Bottom: error on the RBM-inferred pairwise couplings, $\Delta_{J^{(2)}}$, defined in Eq. \ref{eq:error_couplings_formula}. These machines were trained using the PCD-50 scheme, with $N_\mathrm{h} = 100$ and $\gamma = 0.1$.
  • ...and 12 more figures
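
The first step of the pipeline in Figure 2, generating equilibrium samples from a predefined model, can be illustrated for the simplest case with the sketch below. It assumes a ferromagnetic 1D chain with open boundaries and no external field (the boundary conditions used in the paper are not stated in this excerpt), and the function name `sample_open_ising_chain` is hypothetical. Under these assumptions the bond variables $\sigma_i \sigma_{i+1}$ are independent, so equilibrium configurations can be drawn exactly without any MCMC.

```python
import numpy as np

def sample_open_ising_chain(L, beta, M, J=1.0, rng=None):
    """Draw M exact equilibrium samples of an open 1D Ising chain of L spins.

    Assumes zero external field and a uniform coupling J. Each bond
    sigma_i * sigma_{i+1} is then independent and equals +1 with
    probability exp(beta*J) / (2*cosh(beta*J)).
    """
    rng = np.random.default_rng() if rng is None else rng
    p_same = np.exp(beta * J) / (2.0 * np.cosh(beta * J))
    first_spin = rng.choice([-1, 1], size=(M, 1))
    bonds = np.where(rng.random((M, L - 1)) < p_same, 1, -1)
    # The cumulative product turns the first spin and the bond signs into spins.
    return np.cumprod(np.concatenate([first_spin, bonds], axis=1), axis=1)

# Usage with the values quoted for Figure 3 (L = 50, beta = 0.2).
samples = sample_open_ising_chain(L=50, beta=0.2, M=10_000,
                                  rng=np.random.default_rng(0))
nn_correlation = np.mean(samples[:, :-1] * samples[:, 1:])
```

For this model the nearest-neighbour correlation should approach $\tanh(\beta J)$ as $M$ grows, which gives a quick check that the training set is at the intended temperature before an RBM is trained on it.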