Who Said Neural Networks Aren't Linear?

Nimrod Berman, Assaf Hallak, Assaf Shocher

TL;DR

A framework that makes the entire arsenal of linear algebra, including SVD, pseudo-inverse, orthogonal projection and more, applicable to nonlinear mappings and shows that the composition of two Linearizers that share a neural network is also a Linearizer.

Abstract

Neural networks are famously nonlinear. However, linearity is defined relative to a pair of vector spaces, $f: X \to Y$. Is it possible to identify a pair of non-standard vector spaces for which a conventionally nonlinear function is, in fact, linear? This paper introduces a method that makes such vector spaces explicit by construction. We find that if we sandwich a linear operator $A$ between two invertible neural networks, $f(x)=g_y^{-1}(A g_x(x))$, then the corresponding vector spaces $X$ and $Y$ are induced by newly defined addition and scaling actions derived from $g_x$ and $g_y$. We term this kind of architecture a Linearizer. This framework makes the entire arsenal of linear algebra, including SVD, pseudo-inverse, orthogonal projection and more, applicable to nonlinear mappings. Furthermore, we show that the composition of two Linearizers that share a neural network is also a Linearizer. We leverage this property and demonstrate that training diffusion models using our architecture makes the hundreds of sampling steps collapse into a single step. We further utilize our framework to enforce idempotency (i.e. $f(f(x))=f(x)$) on networks leading to a globally projective generative model and to demonstrate modular style transfer.
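
The construction in the abstract can be checked numerically on a toy case. The sketch below is illustrative only: it replaces the invertible networks $g_x$ and $g_y$ with elementwise `sinh` and cube maps, assumes the induced operations are the ordinary ones transported through $g$, and uses hypothetical helper names (`linearizer`, `induced_add`, `induced_scale`) that do not come from the paper. It verifies that $f$ is linear under the induced operations though not under ordinary addition, that two Linearizers sharing $g$ compose into one with matrix $A_2 A_1$, and that a projection matrix yields an idempotent $f$.

```python
import math

# Toy invertible maps standing in for the invertible networks g_x and g_y,
# applied elementwise. Assumption: sinh and cube are illustrative choices only.
def g_x(x): return [math.sinh(t) for t in x]
def g_x_inv(z): return [math.asinh(t) for t in z]
def g_y(y): return [t ** 3 for t in y]
def g_y_inv(z): return [math.copysign(abs(t) ** (1 / 3), t) for t in z]

def matvec(A, x):
    return [sum(a * t for a, t in zip(row, x)) for row in A]

def matmul(A, B):
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)] for row in A]

def linearizer(A, g_in, g_out_inv):
    """Return f(x) = g_out^{-1}(A g_in(x))."""
    return lambda x: g_out_inv(matvec(A, g_in(x)))

def induced_add(g, g_inv, u, v):
    """u (+)_g v, assumed to be g^{-1}(g(u) + g(v))."""
    return g_inv([s + t for s, t in zip(g(u), g(v))])

def induced_scale(g, g_inv, a, u):
    """a (.)_g u, assumed to be g^{-1}(a * g(u))."""
    return g_inv([a * t for t in g(u)])

def close(p, q, tol=1e-9):
    return all(math.isclose(s, t, rel_tol=tol, abs_tol=tol) for s, t in zip(p, q))

A = [[2.0, -1.0], [0.5, 3.0]]
f = linearizer(A, g_x, g_y_inv)
u, v, a = [0.3, -1.2], [1.1, 0.4], 1.7

# Linearity with respect to the induced operations on X and Y.
additive = close(f(induced_add(g_x, g_x_inv, u, v)),
                 induced_add(g_y, g_y_inv, f(u), f(v)))
homogeneous = close(f(induced_scale(g_x, g_x_inv, a, u)),
                    induced_scale(g_y, g_y_inv, a, f(u)))

# The same f is not additive under ordinary vector addition.
ordinary_additive = close(f([s + t for s, t in zip(u, v)]),
                          [s + t for s, t in zip(f(u), f(v))])

# Two Linearizers sharing g compose into a single one with matrix A2 A1.
A1, A2 = [[1.0, 2.0], [0.0, -1.0]], [[0.5, 0.0], [1.5, 2.0]]
f1 = linearizer(A1, g_x, g_x_inv)
f2 = linearizer(A2, g_x, g_x_inv)
f21 = linearizer(matmul(A2, A1), g_x, g_x_inv)
composes = close(f2(f1(u)), f21(u))

# A projection matrix (P P = P) with a shared g gives an idempotent map.
P = [[0.5, 0.5], [0.5, 0.5]]
p = linearizer(P, g_x, g_x_inv)
idempotent = close(p(p(u)), p(u))
```

The composition check is the mechanism behind the one-step sampling claim: when every step shares the same $g$, the inner $g \circ g^{-1}$ pairs cancel and only a product of matrices remains, which can be precomputed.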

Paper Structure

This paper contains 55 sections, 8 theorems, 45 equations, 9 figures, 4 tables.

Key Result

Lemma 1

$(V,\oplus_g,\odot_g)$ is a vector space over $\mathbb{R}$. Proof. See the Appendix (app:axioms).
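
The lemma relies on the operations $\oplus_g$ and $\odot_g$, which this listing names (Definition 2) but does not reproduce. A natural reading, stated here as an assumption rather than a quotation, is that they are the ordinary operations of $\mathbb{R}^n$ transported through an invertible map $g$:

$$u \oplus_g v = g^{-1}\big(g(u)+g(v)\big), \qquad a \odot_g u = g^{-1}\big(a\,g(u)\big).$$

Under that assumption each vector space axiom reduces to the corresponding axiom in $\mathbb{R}^n$. For instance, the zero vector is $g^{-1}(0)$ and the additive inverse of $u$ is $g^{-1}(-g(u))$. Linearity of $f(x)=g_y^{-1}(A g_x(x))$ between the two induced spaces then follows directly:

$$f(u \oplus_{g_x} v) = g_y^{-1}\big(A g_x(u) + A g_x(v)\big) = g_y^{-1}\big(g_y(f(u)) + g_y(f(v))\big) = f(u) \oplus_{g_y} f(v),$$

$$f(a \odot_{g_x} u) = g_y^{-1}\big(a\,A g_x(u)\big) = g_y^{-1}\big(a\,g_y(f(u))\big) = a \odot_{g_y} f(u).$$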

Figures (9)

  • Figure 1: Left: the Linearizer structure (top) is a linear operation sandwiched between two invertible functions (bottom). Right: vector addition and scalar multiplication define induced vector spaces for which $f$ is linear.
  • Figure 2: Comparison between multi-step and one-step flow matching (FM). Panels: (a) multi-step Linear FM; (b) one-step Linear FM.
  • Figure 3: One-step inversion examples. Left and right (in red): original (not generated) data $x_1$ and $x_2$. In between: intermediate images obtained by latent interpolation. See the Appendix (app:interp) for more and higher-resolution results.
  • Figure 4: Quantitative comparisons.
  • Figure 5: Style transfer examples. Left: original image. Middle: style transfer using the left-side and right-side style images. Right: interpolation between the two styles.
  • ...and 4 more figures

Theorems & Definitions (20)

  • Definition 1: Linearizer
  • Definition 2: Induced Vector Space Operations
  • Lemma 1
  • Lemma 2
  • proof
  • Lemma 3
  • Definition 3: Induced Inner Product
  • Lemma 4: Induced spaces are Hilbert
  • Lemma 5: Transpose
  • proof
  • ...and 10 more