On a group of invariances in a class of functions

Shravan Mohan

TL;DR

This work analyzes a class of compositional models built from alternating polynomial layers and rectified monomial activations, and identifies a concrete invariance group generated by linear reparameterizations of the input and by permutations and diagonal scalings between layers. It gives a constructive description of how these invariances can be exploited for canonical representations, for efficient regularized optimization via geometric programming, and for parameter range minimization to improve quantization and robustness. The paper also develops obfuscation protocols for private inference and secure remote training, which use the invariance structure to hide inputs and parameters while preserving functional behavior. The invariance analysis is then extended to self-attention, where it identifies a bilinear query–key invariance, a linear value–output invariance, and the permutation equivariance of Pre-LN Transformer blocks. Overall, the work offers a principled toolkit for model compression, privacy-preserving computation, and the study of nonlinear compositional architectures.

Abstract

A class of parametric functions formed by alternating compositions of multivariate polynomials and rectification-style monomial maps is studied. The layer-wise exponents are treated as fixed hyperparameters and are not optimized. For this family, nontrivial parametric invariances are identified and characterized, i.e., distinct parameter settings that induce identical input-output maps. A constructive description of the invariance structure is provided, enabling sparse function representations, parameter obfuscation, and potential dimensionality reduction for optimization.
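To make "distinct parameter settings that induce identical input-output maps" concrete, the following is a minimal numerical sketch, not the paper's construction. It assumes the simplest polynomial layer (degree 1, i.e. affine) and a rectified monomial of the form `max(z, 0) ** p`; the names `rect_monomial`, `model`, `perm`, `scale`, and `A` are hypothetical. It checks a hidden-layer permutation with positive diagonal scaling, and an invertible linear reparameterization of the input compensated in the first layer.

```python
import numpy as np

def rect_monomial(z, p):
    """Assumed activation: elementwise max(z, 0) raised to a fixed exponent p."""
    return np.maximum(z, 0.0) ** p

def model(x, W1, b1, W2, b2, p):
    """Affine layer -> rectified monomial -> affine layer (assumed toy model)."""
    return W2 @ rect_monomial(W1 @ x + b1, p) + b2

rng = np.random.default_rng(0)
n_in, n_hid, n_out, p = 4, 6, 3, 2
W1 = rng.normal(size=(n_hid, n_in))
b1 = rng.normal(size=n_hid)
W2 = rng.normal(size=(n_out, n_hid))
b2 = rng.normal(size=n_out)
x = rng.normal(size=n_in)
y_ref = model(x, W1, b1, W2, b2, p)

# Hidden-layer invariance: permute the hidden units and scale each one by a
# positive factor. Since max(s*z, 0)**p == s**p * max(z, 0)**p for s > 0,
# dividing the matching columns of W2 by s**p restores the original map.
perm = rng.permutation(n_hid)
scale = rng.uniform(0.5, 2.0, size=n_hid)
W1_new = scale[:, None] * W1[perm]
b1_new = scale * b1[perm]
W2_new = W2[:, perm] / scale**p
y_hidden = model(x, W1_new, b1_new, W2_new, b2, p)

# Input invariance: feed A @ x instead of x and fold A^{-1} into the first layer.
A = rng.normal(size=(n_in, n_in)) + n_in * np.eye(n_in)  # invertible with high probability
y_input = model(A @ x, W1 @ np.linalg.inv(A), b1, W2, b2, p)

print(np.allclose(y_ref, y_hidden), np.allclose(y_ref, y_input))
```

The positivity of `scale` matters in this sketch: a negative factor would flip which side of zero the rectification keeps, so the compensation in `W2_new` would no longer apply.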

Paper Structure (31 sections, 3 theorems, 88 equations)

Introduction
The Invariance Group
Invariance Aware Optimization
Parameter Obfuscation
Parameter Range Minimization
Setup and Notation
Positive/Negative Partition and Layerwise Bounds
Optimization Problem
Convexity and Geometric-Program Structure
Remarks
Obfuscating Remote Training
Problem Setup
Input Obfuscation
First-Layer Compensation
Hidden-Layer Obfuscation via Permutation and Diagonal Scaling
...and 16 more sections

Key Result

Lemma 1

Let $P \in \mathbb{R}^{d_k \times d_k}$ be any invertible matrix. Define transformed parameters Then the attention scores and attention weights are unchanged, i.e.,
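The displayed equations of the lemma are not reproduced in the statement above. One standard form of such an invariance, assumed here rather than taken from the paper, is $W_Q' = W_Q P$ and $W_K' = W_K P^{-\top}$, which gives $Q'K'^{\top} = Q P P^{-1} K^{\top} = QK^{\top}$. The sketch below checks this numerically for single-head scaled dot-product attention with row-vector tokens; the names `W_Q`, `W_K`, `attention_weights`, and the shapes are illustrative assumptions.

```python
import numpy as np

def softmax(s):
    """Row-wise softmax with the usual max-subtraction for stability."""
    s = s - s.max(axis=-1, keepdims=True)
    e = np.exp(s)
    return e / e.sum(axis=-1, keepdims=True)

def attention_weights(X, W_Q, W_K):
    """Return (scores, weights) for single-head scaled dot-product attention."""
    d_k = W_Q.shape[1]
    scores = (X @ W_Q) @ (X @ W_K).T / np.sqrt(d_k)
    return scores, softmax(scores)

rng = np.random.default_rng(0)
n_tokens, d_model, d_k = 5, 8, 4
X = rng.normal(size=(n_tokens, d_model))
W_Q = rng.normal(size=(d_model, d_k))
W_K = rng.normal(size=(d_model, d_k))

# Any invertible P works; the identity shift only keeps it well conditioned.
P = rng.normal(size=(d_k, d_k)) + d_k * np.eye(d_k)

# Assumed transformed parameters: W_Q' = W_Q P, W_K' = W_K P^{-T}.
W_Q_t = W_Q @ P
W_K_t = W_K @ np.linalg.inv(P).T

scores, weights = attention_weights(X, W_Q, W_K)
scores_t, weights_t = attention_weights(X, W_Q_t, W_K_t)

print(np.allclose(scores, scores_t), np.allclose(weights, weights_t))
```

Because the scores agree before the softmax, the attention weights agree as well, which is the sense in which the invariance is bilinear in the query and key parameters.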

Theorems & Definitions (6)

Lemma 1: Query–Key Bilinear Invariance
proof
Lemma 2: Value–Output Linear Invariance
proof
Theorem 1: Permutation Equivariance
proof
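The two remaining results in this list can be illustrated with a small numerical check. This is a hedged sketch under stated assumptions, not the paper's formulation: a single-head Pre-LN block without positional encodings, masking, or learned LayerNorm parameters, with hypothetical names `pre_ln_block`, `M`, and `perm`. The value–output invariance is taken to be $W_V' = W_V M$, $W_O' = M^{-1} W_O$ for invertible $M$, and permutation equivariance means that permuting the input tokens permutes the output tokens in the same way.

```python
import numpy as np

def softmax(s):
    s = s - s.max(axis=-1, keepdims=True)
    e = np.exp(s)
    return e / e.sum(axis=-1, keepdims=True)

def layer_norm(X, eps=1e-5):
    """Per-token normalization without learned gain or bias (assumption)."""
    mu = X.mean(axis=-1, keepdims=True)
    var = X.var(axis=-1, keepdims=True)
    return (X - mu) / np.sqrt(var + eps)

def attention(X, W_Q, W_K, W_V, W_O):
    d_k = W_Q.shape[1]
    A = softmax((X @ W_Q) @ (X @ W_K).T / np.sqrt(d_k))
    return A @ (X @ W_V) @ W_O

def pre_ln_block(X, W_Q, W_K, W_V, W_O, W_1, W_2):
    """Pre-LN block: residual attention, then a residual two-layer ReLU MLP."""
    H = X + attention(layer_norm(X), W_Q, W_K, W_V, W_O)
    return H + np.maximum(layer_norm(H) @ W_1, 0.0) @ W_2

rng = np.random.default_rng(0)
n, d, d_k, d_v, d_ff = 5, 8, 4, 4, 16
X = rng.normal(size=(n, d))
W_Q, W_K = rng.normal(size=(d, d_k)), rng.normal(size=(d, d_k))
W_V, W_O = rng.normal(size=(d, d_v)), rng.normal(size=(d_v, d))
W_1, W_2 = rng.normal(size=(d, d_ff)), rng.normal(size=(d_ff, d))

# Value-output invariance: insert M after W_V and M^{-1} before W_O.
M = rng.normal(size=(d_v, d_v)) + d_v * np.eye(d_v)
out = attention(X, W_Q, W_K, W_V, W_O)
out_t = attention(X, W_Q, W_K, W_V @ M, np.linalg.inv(M) @ W_O)

# Permutation equivariance: block(X[perm]) equals block(X)[perm].
perm = rng.permutation(n)
Y = pre_ln_block(X, W_Q, W_K, W_V, W_O, W_1, W_2)
Y_perm = pre_ln_block(X[perm], W_Q, W_K, W_V, W_O, W_1, W_2)

print(np.allclose(out, out_t), np.allclose(Y_perm, Y[perm]))
```

In this sketch the equivariance holds because every operation is either applied per token (LayerNorm, the MLP, the projections) or, in the case of attention, sums over all tokens in a way that does not depend on their order.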
