Subhomogeneous Deep Equilibrium Models

Pietro Sittoni, Francesco Tudisco

TL;DR

This work addresses the problem of guaranteeing existence and uniqueness of fixed points in implicit-depth networks. It introduces subhomogeneous operators and uses the Thompson distance from nonlinear Perron–Frobenius theory to prove that a positive operator $F$ with subhomogeneity coefficient $0<\mu<1$ has a unique fixed point, and that the fixed-point iteration converges to it linearly, even for general weight matrices. Building on this, the authors propose SubDEQ architectures, in which subhomogeneous activations (optionally combined with normalization) ensure well-posed fixed points for both feedforward and graph-based implicit layers; a power-scaling trick relaxes the constraints further. Experiments on image datasets (MNIST, CIFAR-10, SVHN) and graph benchmarks (Cora, CiteSeer, DBLP, PubMed) show SubDEQ matching or surpassing MonDEQ and nODE baselines while requiring fewer iterations to converge, with the subhomogeneous activations improving stability.
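To make the convergence claim concrete, here is a minimal NumPy sketch of a fixed-point iteration measured in the Thompson distance $d_T(x,y)=\|\log x-\log y\|_\infty$. It is not the paper's SubDEQ layer: the toy map $F(z)=(Az+b)^{\mu}$, the entry-wise nonnegative matrix `A`, the positive bias `b`, and the helper names `thompson_distance` and `fixed_point` are all assumptions chosen for illustration. This classical special case is order-preserving and satisfies $F(\lambda z)\le\lambda^{\mu}F(z)$ for $\lambda\ge 1$, which is enough to make it a $\mu$-contraction in $d_T$; the paper's analysis is what extends such guarantees to general weight matrices.

```python
import numpy as np


def thompson_distance(x, y):
    """Thompson distance between entry-wise positive vectors."""
    return float(np.max(np.abs(np.log(x) - np.log(y))))


def fixed_point(F, z0, tol=1e-10, max_iter=500):
    """Iterate z <- F(z) until successive iterates are tol-close in d_T."""
    z = z0
    history = []  # Thompson distance between successive iterates
    for _ in range(max_iter):
        z_next = F(z)
        step = thompson_distance(z_next, z)
        history.append(step)
        z = z_next
        if step < tol:
            break
    return z, history


# Illustrative (assumed) toy problem: nonnegative weights, positive bias.
rng = np.random.default_rng(0)
n, mu = 5, 0.5
A = rng.uniform(0.0, 1.0, size=(n, n))  # entry-wise nonnegative
b = rng.uniform(0.5, 1.5, size=n)       # entry-wise positive


def F(z):
    # Order-preserving on the positive cone, and F(lam * z) <= lam**mu * F(z)
    # for lam >= 1, so F contracts the Thompson distance by a factor mu.
    return (A @ z + b) ** mu


z_star, history = fixed_point(F, np.ones(n))

# Ratios of successive step sizes; each should be at most mu.
ratios = [history[k + 1] / history[k]
          for k in range(len(history) - 1) if history[k] > 0]
print("iterations:", len(history))
print("largest observed contraction ratio:", max(ratios))
```

Under these assumptions the step sizes in `history` shrink geometrically, which is the linear convergence referred to above; swapping in a map that is not subhomogeneous (for example `mu > 1`) removes that guarantee.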

Abstract

Implicit-depth neural networks have emerged in recent years as powerful alternatives to traditional networks in various applications. However, these models often lack guarantees of existence and uniqueness of their fixed points, raising stability, performance, and reproducibility issues. In this paper, we present a new analysis of the existence and uniqueness of fixed points for implicit-depth neural networks based on the concept of subhomogeneous operators and the nonlinear Perron-Frobenius theory. Compared to previous similar analyses, our theory allows for weaker assumptions on the parameter matrices, thus yielding a more flexible framework for well-defined implicit networks. We illustrate the performance of the resulting subhomogeneous networks on feedforward, convolutional, and graph neural network examples.

Paper Structure

This paper contains 14 sections, 13 theorems, 88 equations, 2 figures, 8 tables.

Key Result

Proposition 3.4

Let $F\colon \mathbb{R}^n \to \mathbb{R}^n$ be differentiable and Lipschitz. If $F \in \operatorname{s-subhom}_{\mu}(\mathbb{R}^n_{++})$, then $F \in \operatorname{subhom}_{\mu}(\mathbb{R}^n_{++})$ and for all $\lambda \geq 1$ we have […]. Assume moreover that $F'(z)>0$ for all $z>0$, i.e. the Jacobian $F'(z)$ is an entry-wise strictly positive matrix. Then […].

Figures (2)

  • Figure 1: Iterations required by the fixed-point method for SubDEQ vs. the Peaceman-Rachford method for MonDEQ. Left: linear layer; Right: convolutional layer.
  • Figure 2: Validation accuracy of the dense architectures during training on MNIST.

Theorems & Definitions (27)

  • Definition 3.1: Subhomogeneous operator
  • Example 3.2
  • Remark 3.3
  • Proposition 3.4
  • Theorem 3.5: See e.g. Lemmens and Nussbaum (2012)
  • Theorem 3.6
  • Theorem 3.7
  • Theorem 3.8
  • Theorem 3.9
  • Lemma 4.1
  • ...and 17 more