Learning Equivariant Functions via Quadratic Forms

Pavan Karjol, Vivek V Kashyap, Rohan Kashyap, Prathosh A P

TL;DR

The paper proposes a quadratic-form framework to learn $G$-equivariant functions by discovering a preserved quadratic form $x^T A x$ and exploiting orthogonal groups that preserve it. It proves a canonical decomposition of equivariant maps into a norm-invariant component and a scale-invariant component, and extends this to diagonal actions on tuples via Gram matrices. A symmetry-discovery mechanism learns $A$ (and thus the underlying group) while fitting data, yielding an explicit, interpretable predictor. Empirical results across synthetic regression, inertia prediction, and top-quark tagging demonstrate accurate symmetry discovery and strong equivariant learning, with notable performance on Lorentz-structure tasks. The approach offers a principled, interpretable alternative to broader symmetry-learning methods, with potential extensions beyond orthogonal groups envisioned for future work.

Abstract

In this study, we introduce a method for learning group (known or unknown) equivariant functions by learning the associated quadratic form $x^T A x$ corresponding to the group from the data. Certain groups, known as orthogonal groups, preserve a specific quadratic form, and we leverage this property to uncover the underlying symmetry group under the assumption that it is orthogonal. By utilizing the corresponding unique symmetric matrix and its inherent diagonal form, we incorporate suitable inductive biases into the neural network architecture, leading to models that are both simplified and efficient. Our approach results in an invariant model that preserves norms, while the equivariant model is represented as a product of a norm-invariant model and a scale-invariant model, where the "product" refers to the group action. Moreover, we extend our framework to a more general setting where the function acts on tuples of input vectors via a diagonal (or product) group action. In this extension, the equivariant function is decomposed into an angular component extracted solely from the normalized first vector and a scale-invariant component that depends on the full Gram matrix of the tuple. This decomposition captures the inter-dependencies between multiple inputs while preserving the underlying group symmetry. We assess the effectiveness of our framework across multiple tasks, including polynomial regression, top quark tagging, and moment of inertia matrix prediction. Comparative analysis with baseline methods demonstrates that our model consistently excels in both discovering the underlying symmetry and efficiently learning the corresponding equivariant function.
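
The abstract states that, for tuples of inputs under a diagonal group action, one component depends on the full Gram matrix of the tuple. A minimal sketch of why the Gram matrix is a natural invariant is given below. It assumes, purely for illustration, the form $A = \mathrm{diag}(1, -1)$ and a hyperbolic rotation as the group element; the helper names (`boost`, `gram`, and so on) are hypothetical and not taken from the paper.

```python
import math

def boost(phi):
    """2x2 hyperbolic rotation; it satisfies g^T A g = A for A = diag(1, -1)."""
    c, s = math.cosh(phi), math.sinh(phi)
    return [[c, s], [s, c]]

def matvec(m, v):
    """Matrix-vector product for nested lists."""
    return [sum(m[i][j] * v[j] for j in range(len(v))) for i in range(len(m))]

def bilinear(x, a, y):
    """Bilinear form x^T A y."""
    return sum(x[i] * a[i][j] * y[j] for i in range(len(x)) for j in range(len(y)))

def gram(vectors, a):
    """Gram matrix with entries x_i^T A x_j."""
    return [[bilinear(x, a, y) for y in vectors] for x in vectors]

# Assumed example form and inputs (not from the paper).
A = [[1.0, 0.0], [0.0, -1.0]]
xs = [[2.0, 0.5], [1.0, -3.0]]

g = boost(0.7)
gxs = [matvec(g, x) for x in xs]  # diagonal action: the same g is applied to every vector

gram_before = gram(xs, A)
gram_after = gram(gxs, A)
```

Because $g^T A g = A$, every entry $x_i^T A x_j$ is unchanged when the same $g$ acts on all vectors, so `gram_before` and `gram_after` agree up to floating-point error.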

Paper Structure

This paper contains 58 sections, 14 theorems, 92 equations, 3 figures, 5 tables.

Key Result

Theorem 4.1

Let $A$ be a non-zero $n \times n$ matrix, and define $U := \{ x \in \mathbb{R}^n : x^T A x = 0 \}$. Let $G$ be an orthogonal group that preserves the quadratic form $x^T A x$, and consider the action of $G$ on $\mathbb{R}^n \setminus U$. For a given $c$ ...
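
The setup of the theorem (the set $U$ and the action of $G$ on its complement) can be illustrated numerically. The sketch below is not the theorem's conclusion, which is cut off above; it only checks that a form-preserving map sends $U$ to $U$ and keeps the signed norm from the Figure 1 caption unchanged. The choice $A = \mathrm{diag}(1, -1)$, the hyperbolic rotation, and all function names are assumptions made for illustration.

```python
import math

# Assumed example form: A = diag(1, -1), so x^T A x = x0^2 - x1^2.
A = [[1.0, 0.0], [0.0, -1.0]]

def quad(x, a=A):
    """Quadratic form x^T A x."""
    n = len(x)
    return sum(x[i] * a[i][j] * x[j] for i in range(n) for j in range(n))

def in_U(x, tol=1e-9):
    """True when x lies (numerically) in U = {x : x^T A x = 0}."""
    return abs(quad(x)) <= tol

def signed_norm(x):
    """sign(x^T A x) * sqrt(|x^T A x|), following the Figure 1 caption."""
    q = quad(x)
    return math.copysign(math.sqrt(abs(q)), q)

def boost(phi):
    """Hyperbolic rotation; preserves x^T A x for the assumed A."""
    c, s = math.cosh(phi), math.sinh(phi)
    return [[c, s], [s, c]]

def act(g, x):
    """Group action by matrix-vector multiplication."""
    return [sum(g[i][j] * x[j] for j in range(len(x))) for i in range(len(g))]

g = boost(0.3)
null_vec = [1.0, 1.0]      # x^T A x = 0, so it lies in U
regular_vec = [1.0, 2.0]   # x^T A x = -3, so it lies outside U

moved_null = act(g, null_vec)
moved_regular = act(g, regular_vec)
```

Since $g$ preserves $x^T A x$, points of $U$ stay in $U$ and points outside stay outside, which is why restricting the action to $\mathbb{R}^n \setminus U$ is well defined.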

Figures (3)

  • Figure 1: Schematic representation of the proposed method. $\phi_n$ and $\phi_s$ denote neural networks: the norm-invariant network takes the input norms and the scale-invariant network takes the normalized inputs, where $\Vert x \Vert_A = \operatorname{sign}(x^T A x) \sqrt{\vert x^T A x \vert}$.
  • Figure 2: For an $O(2)$-equivariant function $f$, we have $f(x) = f(R(\theta) r_1 e_1) = R(\theta) f(r_1 e_1)$. Similarly, $f(y) = R(\theta) f(r_2 e_1)$ and $f(z) = R(\theta) f(r_3 e_1)$. Hence, $\phi_s(x) = \phi_s(y) = \phi_s(z) = R(\theta)$. This property holds for general orthogonal groups as well.
  • Figure 3: Comparison of ground-truth and learned $A$ matrices for top-tagging (top) and synthetic regression (bottom) using $G$-Ortho-Nets.
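
The identity in the Figure 2 caption, $f(x) = R(\theta) f(r e_1)$, can be checked on a toy example. The sketch below assumes a hand-picked equivariant map $f(x) = (1 + \Vert x \Vert^2)\, x$ and the Euclidean norm on $\mathbb{R}^2$; neither the map nor the function names come from the paper, and no neural networks are involved.

```python
import math

def rot(theta):
    """Rotation matrix R(theta) in the plane."""
    c, s = math.cos(theta), math.sin(theta)
    return [[c, -s], [s, c]]

def matvec(m, v):
    """Matrix-vector product for nested lists."""
    return [sum(m[i][j] * v[j] for j in range(len(v))) for i in range(len(m))]

def f(x):
    """Toy O(2)-equivariant map f(x) = (1 + |x|^2) x (assumed for illustration)."""
    scale = 1.0 + x[0] ** 2 + x[1] ** 2
    return [scale * x[0], scale * x[1]]

def decompose(x):
    """Split x into its radius r and angle theta, so that x = R(theta) r e_1."""
    return math.hypot(x[0], x[1]), math.atan2(x[1], x[0])

def f_via_decomposition(x):
    """Evaluate f on the canonical point r e_1, then rotate the result by R(theta)."""
    r, theta = decompose(x)
    return matvec(rot(theta), f([r, 0.0]))

x = [1.5, -2.0]
direct = f(x)                      # f applied to x itself
factored = f_via_decomposition(x)  # R(theta) f(r e_1)

# The angle depends only on the direction of x, not on its length.
theta_x = decompose(x)[1]
theta_scaled = decompose([3.0 * x[0], 3.0 * x[1]])[1]
```

Here `direct` and `factored` agree up to rounding, and `theta_x` matches `theta_scaled`, mirroring the caption's point that the rotation part is the same for inputs that differ only in scale.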

Theorems & Definitions (33)

  • Definition 3.1: Group
  • Definition 3.2: Group Action
  • Definition 3.3: Orbit
  • Definition 3.4: General Linear Group
  • Definition 3.5: Equivariant Function
  • Theorem 4.1
  • Theorem 4.2
  • Remark 4.1
  • Remark 4.2
  • Remark 4.3
  • ...and 23 more