Scale redundancy and soft gauge fixing in positively homogeneous neural networks

Rodrigo Carmo Terin

TL;DR

A structural link is established between gauge-orbit geometry and optimization conditioning, providing a concrete connection between gauge-theoretic concepts and machine learning.

Abstract

Neural networks with positively homogeneous activations exhibit an exact continuous reparametrization symmetry: neuron-wise rescalings generate parameter-space orbits along which the input-output function is invariant. We interpret this symmetry as a gauge redundancy and introduce gauge-adapted coordinates that separate invariant directions from scale-imbalance directions. Inspired by gauge fixing in field theory, we introduce a soft orbit-selection (norm-balancing) functional acting only on the redundant scale coordinates. We show analytically that it induces dissipative relaxation of the imbalance modes while preserving the realized function. In controlled experiments, this orbit-selection penalty expands the stable learning-rate regime and suppresses scale drift without changing expressivity. These results establish a structural link between gauge-orbit geometry and optimization conditioning, providing a concrete connection between gauge-theoretic concepts and machine learning.
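
To make the rescaling symmetry concrete, here is a minimal worked statement for a one-hidden-layer network. The architecture and the symbols $w_i$, $b_i$, $a_i$, $s_i$, $h$ are illustrative assumptions and are not necessarily the paper's notation; the only property used is positive homogeneity of degree one, as satisfied by ReLU.

$$
f_\theta(x) = \sum_{i=1}^{h} a_i\,\phi\!\left(w_i^\top x + b_i\right), \qquad \phi(s z) = s\,\phi(z) \quad \text{for all } s > 0 .
$$

Under the neuron-wise rescaling $(w_i, b_i, a_i) \mapsto (s_i w_i,\; s_i b_i,\; a_i / s_i)$ with $s_i > 0$, each hidden unit contributes

$$
\frac{a_i}{s_i}\,\phi\!\left(s_i\,(w_i^\top x + b_i)\right) = \frac{a_i}{s_i}\, s_i\,\phi\!\left(w_i^\top x + b_i\right) = a_i\,\phi\!\left(w_i^\top x + b_i\right),
$$

so $f_\theta(x)$ is unchanged for every input $x$, while the individual norms $\lVert w_i \rVert$ and $\lvert a_i \rvert$ move freely along the orbit.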

Paper Structure

This paper contains 14 sections, 50 equations, 5 figures, and 2 tables.

Figures (5)

  • Figure 1: Gauge redundancy in parameter space. Neuron-wise positive rescalings generate continuous gauge orbits.
  • Figure 2: Toy regression dataset used in Sec. \ref{sec:experiments}. Training samples are drawn from $[-2,2]$ with additive Gaussian noise. The black curve shows the noiseless target function evaluated on the validation grid over $[-3,3]$.
  • Figure 3: Validation mean-squared error as a function of the gauge-fixing strength $\lambda$. Each point corresponds to the mean over independent random seeds, with error bars given by the standard deviation.
  • Figure 4: Learning-rate stress test comparing baseline and gauge-fixed training ($\lambda=0.2$). Gauge fixing expands the empirically stable learning-rate regime in this toy setup.
  • Figure 5: Functional invariance under random gauge transformations. Shown is a boxplot of the invariance error $\Delta_{\mathrm{inv}}$ defined in Eq. \ref{eq:invariance_error} across 200 random neuron-wise rescalings with $s_i>0$. All values remain at double-precision numerical accuracy (min $3.99\times 10^{-18}$, median $6.68\times 10^{-18}$, max $1.14\times 10^{-17}$), confirming that the gauge transformations leave the network function unchanged up to machine precision.
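
The invariance check described for Figure 5 can be sketched in a few lines of NumPy. This is an illustration under stated assumptions, not the paper's code: the one-hidden-layer ReLU network, its width, the log-uniform sampling of $s_i$, and the use of the maximum absolute output difference as a stand-in for $\Delta_{\mathrm{inv}}$ are all hypothetical choices, since the defining equation is not reproduced here. Only the grid over $[-3,3]$ and the count of 200 rescalings come from the captions.

```python
import numpy as np

rng = np.random.default_rng(0)

def relu(z):
    # ReLU is positively homogeneous: relu(s * z) == s * relu(z) for s > 0.
    return np.maximum(z, 0.0)

def forward(x, W, b, a):
    # x: (n, d) inputs, W: (h, d) hidden weights, b: (h,) biases, a: (h,) output weights.
    return relu(x @ W.T + b) @ a

def rescale(W, b, a, s):
    # Neuron-wise gauge transformation: scale incoming weights and bias
    # by s_i and the outgoing weight by 1 / s_i.
    return W * s[:, None], b * s, a / s

# Hypothetical network size; the paper's architecture is not given here.
d, h = 1, 16
W = rng.normal(size=(h, d))
b = rng.normal(size=h)
a = rng.normal(size=h)

# Grid over [-3, 3], matching the validation interval in the Figure 2 caption.
x = np.linspace(-3.0, 3.0, 101)[:, None]
y_ref = forward(x, W, b, a)

errors = []
for _ in range(200):
    # Assumed sampling scheme: log-uniform positive scales s_i in [e^-2, e^2].
    s = np.exp(rng.uniform(-2.0, 2.0, size=h))
    y_new = forward(x, *rescale(W, b, a, s))
    # Assumed proxy for the invariance error: max absolute output difference.
    errors.append(np.max(np.abs(y_new - y_ref)))

errors = np.array(errors)
print("min / median / max:", errors.min(), np.median(errors), errors.max())
```

The exact magnitudes depend on the error definition and the sampled scales, so they need not match the values quoted in the caption; the point is that the differences stay at floating-point round-off level even though the parameters themselves change by large factors.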