Stochastic Neural Network Symmetrisation in Markov Categories

Rob Cornish

TL;DR

The paper addresses preserving and upgrading symmetry in neural networks when outputs may be stochastic. It reframes equivariance and symmetrisation within Markov categories, defining a general, compositional framework that upgrades $H$-equivariant morphisms to $G$-equivariant ones via restriction functors and coset actions. A key contribution is a two-step end-to-end methodology that uses $L_\varphi R_\varphi$ and a precomposition by $\Gamma$ to construct symmetrisation procedures, with results guaranteeing stability and, under suitable conditions, surjectivity. The framework unifies and extends prior deterministic methods (canonicalisation, frame averaging) and introduces stochastic symmetrisation for Markov kernels, enabling exact sampling and data-augmentation-like behaviour without expensive averaging. Empirically, the approach yields competitive or superior performance on synthetic tasks and provides a principled, scalable way to enforce symmetry in complex ML systems while preserving probabilistic structure and interpretability.

Abstract

We consider the problem of symmetrising a neural network along a group homomorphism: given a homomorphism $\varphi : H \to G$, we would like a procedure that converts $H$-equivariant neural networks to $G$-equivariant ones. We formulate this in terms of Markov categories, which allows us to consider neural networks whose outputs may be stochastic, but with measure-theoretic details abstracted away. We obtain a flexible and compositional framework for symmetrisation that relies on minimal assumptions about the structure of the group and the underlying neural network architecture. Our approach recovers existing canonicalisation and averaging techniques for symmetrising deterministic models, and extends to provide a novel methodology for symmetrising stochastic models also. Beyond this, our findings also demonstrate the utility of Markov categories for addressing complex problems in machine learning in a conceptually clear yet mathematically precise way.
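
As a rough illustration of what symmetrisation means in the simplest setting, the sketch below takes $H$ to be the trivial group (so every map is $H$-equivariant) and $G$ to be the cyclic group of quarter-turn rotations of the plane. It shows the familiar averaging construction for deterministic maps and a sampled counterpart whose output is random. The names `rotate`, `f`, `symmetrise_average`, and `symmetrise_sample` are hypothetical and chosen for illustration only; this is a minimal sketch under those assumptions and is not claimed to be the paper's exact procedure.

```python
import math
import random


def rotate(k, v):
    """Rotate the 2D vector v by k quarter turns (the action of the cyclic group C4)."""
    x, y = v
    for _ in range(k % 4):
        x, y = -y, x
    return (x, y)


def f(v):
    """Hypothetical stand-in for a network R^2 -> R^2 with no built-in symmetry."""
    x, y = v
    return (math.tanh(x + 2.0 * y), x * y + 0.5)


def symmetrise_average(f, v):
    """Deterministic averaging: the mean over all g of g . f(g^{-1} . v)."""
    outs = [rotate(k, f(rotate(-k, v))) for k in range(4)]
    return tuple(sum(o[i] for o in outs) / 4 for i in range(2))


def symmetrise_sample(f, v, rng):
    """Stochastic variant: draw g uniformly at random and return g . f(g^{-1} . v)."""
    k = rng.randrange(4)
    return rotate(k, f(rotate(-k, v)))


v = (0.3, -1.2)
lhs = symmetrise_average(f, rotate(1, v))  # symmetrised map applied to g . v
rhs = rotate(1, symmetrise_average(f, v))  # g applied to the symmetrised output
print(lhs, rhs)  # the two should agree up to floating-point error
print(symmetrise_sample(f, v, random.Random(0)))  # one random draw
```

The averaged map costs one evaluation of `f` per group element, whereas the sampled variant costs a single evaluation per draw; its individual samples are not equivariant, but the distribution of its output transforms with the group action.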

Paper Structure (71 sections, 25 theorems, 202 equations, 2 figures)

This paper contains 71 sections, 25 theorems, 202 equations, 2 figures.

Key Result

Proposition 3.18

Let $p(y|x)$ be a conditional density given $x \in X$ with respect to some base measure $\mu$ on $Y$, and denote by $k : X \to Y$ the Markov kernel this induces, namely $k(B|x) := \int_B p(y|x) \, \mu(dy)$, where $B \subseteq Y$ is measurable and $x \in X$. Suppose $G$ is a group acting on $X$ and $Y$ in $\mathsf{Meas}$ such that the stochastic equivariance condition on densities (eq:stochastic-equivariance-density-definition) holds, and that moreover $g \cdot \mu = \mu$ for all $g \in G$ ...
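
The statement above is cut off and the referenced density condition is not shown, so the following is only a sanity check under stated assumptions. It assumes the density condition reads $p(g \cdot y \mid g \cdot x) = p(y \mid x)$ and checks, in one concrete case, that the induced kernel then satisfies $k(g \cdot B \mid g \cdot x) = k(B \mid x)$. The example (a unit-variance Gaussian on the real line, Lebesgue base measure, and the two-element group acting by sign flip) and the names `density`, `kernel`, and `act` are hypothetical choices for illustration.

```python
import math


def density(y, x):
    """Gaussian conditional density p(y | x) = N(y; x, 1) w.r.t. Lebesgue measure."""
    return math.exp(-0.5 * (y - x) ** 2) / math.sqrt(2.0 * math.pi)


def kernel(a, b, x):
    """k([a, b] | x): the integral of p(y | x) over [a, b], via the error function."""
    def cdf(t):
        return 0.5 * (1.0 + math.erf((t - x) / math.sqrt(2.0)))
    return cdf(b) - cdf(a)


def act(g, t):
    """Action of g in {+1, -1} on the real line by sign flip (Lebesgue measure is invariant)."""
    return g * t


x, a, b, g = 0.7, -0.2, 1.5, -1

# Assumed density-level condition: p(g.y | g.x) == p(y | x), checked at y = 0.4.
density_ok = math.isclose(density(act(g, 0.4), act(g, x)), density(0.4, x))

# Kernel-level check: the image of [a, b] under g is the interval [-b, -a].
lo, hi = sorted((act(g, a), act(g, b)))
kernel_ok = math.isclose(kernel(lo, hi, act(g, x)), kernel(a, b, x))

print(density_ok, kernel_ok)
```

Invariance of the base measure matters here: the change of variables that moves $g$ from the integration region onto the density introduces no extra factor only because the sign flip preserves Lebesgue measure.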

Figures (2)

  • Figure 1: An illustration of stochastic equivariance. Here $X$ is a space of images, $Y$ is a space of coordinates, and the group $G$ consists of 2D rotations and translations. The model $k : X \to Y$ produces a noisy estimate of the location of the banana in its input, with repeated samples depicted here in blue. Stochastic equivariance means that the overall distribution of these samples varies with the action of the group as shown. This is distinct from equivariance at the level of individual samples, which is a more rigid constraint.
  • Figure 2: Average test loss obtained after training each model across the range of problem dimensions $d$ considered.
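
The distinction drawn in the Figure 1 caption, between equivariance of the output distribution and equivariance of individual samples, can be seen in a toy model. The sketch below is an assumption-laden illustration, not the paper's experiment: the hypothetical `model` adds noise drawn uniformly from four unit steps to its input, and the group is generated by a quarter-turn rotation. Because the noise distribution is itself rotation-invariant, the output distribution rotates with the input, yet for a fixed noise draw the individual sample does not.

```python
import random

# Noise support that is invariant under quarter-turn rotations (hypothetical choice).
NOISE = [(1.0, 0.0), (0.0, 1.0), (-1.0, 0.0), (0.0, -1.0)]


def rotate(v):
    """Quarter-turn rotation of a 2D vector."""
    x, y = v
    return (-y, x)


def model(v, noise):
    """Noisy location estimate: the input shifted by a given noise vector."""
    return (v[0] + noise[0], v[1] + noise[1])


def sample(v, rng):
    """Draw one output of the stochastic model, with noise uniform over NOISE."""
    return model(v, rng.choice(NOISE))


def output_support(v):
    """All equally likely outputs for input v, sorted for comparison."""
    return sorted(model(v, n) for n in NOISE)


x = (2.0, 5.0)

# Distribution level: outputs are uniform on their support, so comparing supports suffices.
distribution_equivariant = output_support(rotate(x)) == sorted(
    rotate(y) for y in output_support(x)
)

# Sample level: hold the noise draw fixed and compare the two sides directly.
n = NOISE[0]
sample_equivariant = model(rotate(x), n) == rotate(model(x, n))

print(distribution_equivariant, sample_equivariant)
print(sample(x, random.Random(0)))  # one random draw
```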

Theorems & Definitions (137)

  • Remark 2.1
  • Remark 2.2
  • Definition 3.1
  • Example 3.2
  • Example 3.3
  • Example 3.4
  • Definition 3.5
  • Example 3.6
  • Remark 3.7
  • Example 3.8
  • ...and 127 more