Computing frustration and near-monotonicity in deep neural networks

Joel Wendin, Erik G. Larsson, Claudio Altafini

TL;DR

The paper investigates whether pretrained deep CNNs encode an intrinsic order by viewing their architectures as signed DAGs and measuring their structural balance through a frustration index. By linking structural balance to monotonicity, it demonstrates that CNNs tend to be less frustrated than null models and therefore exhibit near-monotone input-output behavior, suggesting a form of implicit regularization emerging from training. The authors introduce active-subnetwork analysis, a direct IO-monotonicity test, and a concrete heuristic for estimating frustration, applying them to multiple popular CNNs and null models. The findings reveal a robust, input-sensitive partial order that persists under perturbations and may contribute to the networks' generalization and stability properties. This framework bridges spin-glass inspired disorder analysis with monotone system theory to provide a novel lens on DNN organization and regularization.

Abstract

For the signed graph associated to a deep neural network, one can compute the frustration level, i.e., test how close or distant the graph is to structural balance. For all the pretrained deep convolutional neural networks we consider, we find that the frustration is always less than expected from null models. From a statistical physics point of view, and in particular in reference to an Ising spin glass model, the reduced frustration indicates that the amount of disorder encoded in the network is less than in the null models. From a functional point of view, low frustration (i.e., proximity to structural balance) means that the function representing the network behaves near-monotonically, i.e., more similarly to a monotone function than in the null models. Evidence of near-monotonic behavior along the partial order determined by frustration is observed for all networks we consider. This confirms that the class of deep convolutional neural networks tends to have a more ordered behavior than expected from null models, and suggests a novel form of implicit regularization.

Paper Structure

This paper contains 20 sections, 5 theorems, 28 equations, 8 figures, 1 table, 1 algorithm.

Key Result

Proposition 1

A differentiable function $f\, : \, \mathbb{R}^n \to \mathbb{R}^n$ is monotone w.r.t. $\mathbb{S}$ iff $S \frac{\partial f}{\partial \bm{z}} (\bm{z}) S \geq 0$ $\forall \, \bm{z} \in \mathbb{R}^n$, i.e., iff $\mathcal{G}\left( \frac{\partial f}{\partial \bm{z}} (\bm{z}) \right)$ is structurally ba
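The sign condition in Proposition 1 can be checked numerically at individual points. The sketch below is illustrative only and assumes that $S$ is the diagonal $\pm 1$ matrix associated with the orthant $\mathbb{S}$, as the notation suggests. The helper names `is_monotone_at` and `jacobian_tanh_layer`, the toy map $f(\bm{z}) = \tanh(W\bm{z})$, and the matrix `W` are hypothetical and not taken from the paper. Passing the test at sampled points is only necessary evidence, since the proposition requires the inequality for every $\bm{z}$.

```python
import numpy as np

def is_monotone_at(jac, s, tol=1e-12):
    """Check the entrywise condition S J S >= 0 for one Jacobian J.

    `s` is a vector of +1/-1 entries, i.e. the diagonal of S (assumed form).
    """
    s = np.asarray(s, dtype=float)
    gauged = s[:, None] * np.asarray(jac, dtype=float) * s[None, :]
    return bool(np.all(gauged >= -tol))

def jacobian_tanh_layer(W, z):
    """Analytic Jacobian of the toy map f(z) = tanh(W z) (hypothetical example)."""
    pre = W @ z
    return (1.0 - np.tanh(pre) ** 2)[:, None] * W

# Toy weights whose sign pattern is balanced by s = (+1, -1).
W = np.array([[1.0, -2.0],
              [-3.0, 1.0]])
rng = np.random.default_rng(0)
points = rng.normal(size=(50, 2))

# Sampled check with the balancing orthant and with the standard orthant.
ok_balanced = all(is_monotone_at(jacobian_tanh_layer(W, z), [1, -1]) for z in points)
ok_standard = all(is_monotone_at(jacobian_tanh_layer(W, z), [1, 1]) for z in points)
print(ok_balanced, ok_standard)
```

In this toy case the Jacobian is a positive diagonal matrix times `W`, so its sign pattern is that of `W` at every point, which is why a single choice of `s` works everywhere. For a real network the sign pattern can change with the input, which is the situation the active-subnetwork analysis addresses.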

Figures (8)

  • Figure 1: Constructing the adjacency matrix of a CNN. Each layer of the CNN (a) is represented by a sub-diagonal block in the adjacency matrix of the network (b). Convolutional operations (c) are represented as linear transformations (d), which give the weights of the edges in the multibanded Toeplitz adjacency matrix. (e): Heuristic minimization procedure used to compute frustration: choose the row of the adjacency matrix with the most negative sum and apply a gauge transformation to it (flipping the sign of all entries in the row and column), repeating until no negative row sum remains.
  • Figure 2: Frustration of the seven pretrained CNNs ($\epsilon_{\rm R}$), their active subnetworks ($\epsilon_{\rm act}$) and their null models ($\epsilon_{\rm N1}$, $\epsilon_{\rm N2}$ and $\epsilon_{\rm N3}$).
  • Figure 3: Histograms of the output alignment fraction $\Omega$ and associated values of $\lambda$, for the real networks (azure) and for two null models (violet and green). For each of the seven CNNs, the histogram of the real network differs significantly from the other two: it is much broader on both sides of $0.5$, showing that $|\Omega -0.5|$ is much larger than expected from null models. For each histogram, the two vertical bars identify the interval of width $2 \lambda$ described in Eq. \ref{eq:Iomonot2}. See Fig. \ref{fig:lambda-panel} in the SI for details on how to compute $\lambda$.
  • Figure 4: Output alignment fraction $\Omega$ in response to 100 perturbations $\bm{\delta}_{\mathbb{S}_{\bm{x}}}$ of different amplitude for a few images $\bm{x}_1$, as a function of the amplitude $\|\bm{\delta}_{\mathbb{S}_{\bm{x}}} \|$, for each of the seven CNNs. For the real network (azure) the alignment is normally larger than for the two null models (violet and green), and all perturbations for an image (linked by a continuous line) tend to be on the same side of $0.5$, i.e., always increasing or always decreasing.
  • Figure 5: Summary of the results. (a) Frustration computed for all real networks and null models. (b) $\lambda$-monotonicity values for all networks with real weights, re-initialized weights (N3) and real weights but with input perturbation in a random direction.
  • ...and 3 more figures
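
The heuristic described in the caption of Figure 1(e) can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the authors' implementation: it operates on a symmetric signed adjacency matrix with zero diagonal (for instance the symmetrized version of the block sub-diagonal matrix), and it reports frustration as the fraction of total edge weight that remains negative after the gauge transformations, which is an assumed normalization. The function names `greedy_gauge` and `frustration_estimate` are hypothetical.

```python
import numpy as np

def greedy_gauge(sym_adj, tol=1e-12, max_flips=100_000):
    """Flip the node with the most negative row sum until none is negative.

    Assumes `sym_adj` is symmetric with zero diagonal. Each flip then raises
    the sum of all entries by -4 * (row sum), so the loop terminates.
    """
    a = np.array(sym_adj, dtype=float)
    s = np.ones(a.shape[0])          # accumulated gauge, one +1/-1 per node
    for _ in range(max_flips):
        row_sums = a.sum(axis=1)
        i = int(np.argmin(row_sums))
        if row_sums[i] >= -tol:
            break                    # no negative row sum remains
        a[i, :] *= -1.0              # gauge transformation on node i:
        a[:, i] *= -1.0              # negate its row and its column
        s[i] *= -1.0
    return a, s

def frustration_estimate(sym_adj):
    """Upper bound on frustration as a weight fraction (assumed normalization)."""
    gauged, s = greedy_gauge(sym_adj)
    total = np.abs(gauged).sum()
    negative = np.abs(gauged[gauged < 0]).sum()
    return (negative / total if total > 0 else 0.0), s

# Balanced triangle: the product of the edge signs is positive.
balanced = np.array([[0, -1, 1], [-1, 0, -1], [1, -1, 0]], dtype=float)
# Frustrated triangle: all three edges negative.
frustrated = -(np.ones((3, 3)) - np.eye(3))

eps_bal, s_bal = frustration_estimate(balanced)
eps_fru, s_fru = frustration_estimate(frustrated)
print(eps_bal, s_bal)
print(eps_fru, s_fru)
```

Because the procedure is greedy, the value it returns is an upper bound on the true minimum rather than the exact frustration. The returned sign vector `s` identifies the orthant along which near-monotone behavior would be tested.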

Theorems & Definitions (5)

  • Proposition 1
  • Theorem 1
  • Lemma 1
  • Theorem 2
  • Corollary 1