Scale redundancy and soft gauge fixing in positively homogeneous neural networks

Rodrigo Carmo Terin

TL;DR

A structural link is established between gauge-orbit geometry and optimization conditioning, providing a concrete connection between gauge-theoretic concepts and machine learning.

Abstract

Neural networks with positively homogeneous activations exhibit an exact continuous reparametrization symmetry: neuron-wise rescalings generate parameter-space orbits along which the input--output function is invariant. We interpret this symmetry as a gauge redundancy and introduce gauge-adapted coordinates that separate invariant and scale-imbalance directions. Inspired by gauge fixing in field theory, we introduce a soft orbit-selection (norm-balancing) functional acting only on redundant scale coordinates. We show analytically that it induces dissipative relaxation of imbalance modes while preserving the realized function. In controlled experiments, this orbit-selection penalty expands the stable learning-rate regime and suppresses scale drift without changing expressivity. These results establish a structural link between gauge-orbit geometry and optimization conditioning, providing a concrete connection between gauge-theoretic concepts and machine learning.

Paper Structure (14 sections, 50 equations, 5 figures, 2 tables)

Introduction
Reparametrization Symmetry as Gauge Redundancy
Gauge Fixing by Norm Balancing
Gradient-flow dynamics in gauge coordinates
Experiments
Discussion and Interdisciplinary Perspective
Conclusion
Derivation of the gradient-flow equations in gauge coordinates
Preliminaries and notation
Gradients of the logarithmic norm coordinates
Gradients of the gauge-fixing functional
Gauge-only contribution to the block dynamics
Dynamics of $\alpha_i$ and $\beta_i$
Closed dynamics for the gauge coordinate $v_i$

Figures (5)

Figure 1: Gauge redundancy in parameter space. Neuron-wise positive rescalings generate continuous gauge orbits.
Figure 2: Toy regression dataset used in Sec. \ref{sec:experiments}. Training samples are drawn from $[-2,2]$ with additive Gaussian noise. The black curve shows the noiseless target function evaluated on the validation grid over $[-3,3]$.
Figure 3: Validation mean-squared error as a function of the gauge-fixing strength $\lambda$. Each point corresponds to the mean over independent random seeds, with error bars given by the standard deviation.
Figure 4: Learning-rate stress test comparing baseline and gauge-fixed training ($\lambda=0.2$). Gauge fixing expands the empirically stable learning-rate regime in this toy setup.
Figure 5: Functional invariance under random gauge transformations. Shown is a boxplot of the invariance error $\Delta_{\mathrm{inv}}$ defined in Eq. \ref{eq:invariance_error} across 200 random neuron-wise rescalings with $s_i>0$. All values remain at double-precision numerical accuracy (min $3.99\times 10^{-18}$, median $6.68\times 10^{-18}$, max $1.14\times 10^{-17}$), confirming that the gauge transformations leave the network function unchanged up to machine precision.
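
The invariance check summarized in Figure 5 can be illustrated with a minimal sketch. Everything specific here is an assumption made for illustration and is not taken from the paper: a two-layer scalar ReLU network, 16 hidden units, scales $s_i$ drawn log-uniformly from $[0.1, 10]$, and an invariance error defined as the maximum absolute output difference on a grid over $[-3,3]$ (the paper's own $\Delta_{\mathrm{inv}}$ may be defined differently). The names `forward`, `rescale`, and `invariance_error` are hypothetical helpers.

```python
import random

def relu(z):
    # ReLU is positively homogeneous: relu(s * z) == s * relu(z) for s > 0.
    return z if z > 0.0 else 0.0

def forward(x, W, b, a, c):
    # Assumed toy model: f(x) = sum_i a[i] * relu(W[i] * x + b[i]) + c.
    return sum(a_i * relu(w_i * x + b_i) for w_i, b_i, a_i in zip(W, b, a)) + c

def rescale(W, b, a, s):
    # Neuron-wise gauge transformation with s_i > 0:
    # (w_i, b_i, a_i) -> (s_i * w_i, s_i * b_i, a_i / s_i).
    if any(s_i <= 0.0 for s_i in s):
        raise ValueError("gauge scales must be strictly positive")
    W2 = [s_i * w_i for s_i, w_i in zip(s, W)]
    b2 = [s_i * b_i for s_i, b_i in zip(s, b)]
    a2 = [a_i / s_i for s_i, a_i in zip(s, a)]
    return W2, b2, a2

def invariance_error(xs, W, b, a, c, W2, b2, a2):
    # Assumed error measure: largest absolute output difference on the grid.
    return max(abs(forward(x, W, b, a, c) - forward(x, W2, b2, a2, c)) for x in xs)

rng = random.Random(0)
n_hidden = 16  # assumed width, chosen only for the illustration
W = [rng.gauss(0.0, 1.0) for _ in range(n_hidden)]
b = [rng.gauss(0.0, 1.0) for _ in range(n_hidden)]
a = [rng.gauss(0.0, 1.0) for _ in range(n_hidden)]
c = rng.gauss(0.0, 1.0)
xs = [-3.0 + 6.0 * k / 200 for k in range(201)]  # grid over [-3, 3]

errors = []
for _ in range(200):  # 200 random neuron-wise rescalings
    s = [10.0 ** rng.uniform(-1.0, 1.0) for _ in range(n_hidden)]
    W2, b2, a2 = rescale(W, b, a, s)
    errors.append(invariance_error(xs, W, b, a, c, W2, b2, a2))

worst = max(errors)
print(f"largest invariance error over 200 rescalings: {worst:.3e}")
```

In this sketch the rescaled and original networks agree up to floating-point rounding, since each hidden unit's output is multiplied by $s_i$ and its outgoing weight is divided by $s_i$. The positivity check in `rescale` matters: a negative scale would flip the sign of the pre-activation, and the ReLU would no longer commute with the rescaling.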
