MatrixNet: Learning over symmetry groups using learned group representations

Lucas Laird, Circe Hsu, Asilata Bapat, Robin Walters

TL;DR

MatrixNet introduces learned matrix representations for group elements to enable neural models to operate directly on symmetry groups. By encoding generators as invertible matrices and composing them multiplicatively, the approach enforces group structure through matrix blocks and an auxiliary relation loss, with variants that incorporate linear, nonlinear, and block-diagonal designs. Empirical results show strong performance on order prediction in finite groups and on braid-group predictions, with excellent generalization to longer words and to unseen group elements. The work advances learning over algebraic structures, offering improved data efficiency and potential interpretability through connection to irreducible subspaces and categorical braid actions.

Abstract

Group theory has been used in machine learning to provide a theoretically grounded approach for incorporating known symmetry transformations in tasks from robotics to protein modeling. In these applications, equivariant neural networks use known symmetry groups with predefined representations to learn over geometric input data. We propose MatrixNet, a neural network architecture that learns matrix representations of group element inputs instead of using predefined representations. MatrixNet achieves higher sample efficiency and better generalization than several standard baselines in prediction tasks over several finite groups and the Artin braid group. We also show that MatrixNet respects group relations, allowing generalization to group elements of greater word length than those in the training set.
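To make "respects group relations" concrete, the sketch below measures how far a set of generator matrices is from satisfying the braid relation $\sigma_1\sigma_2\sigma_1 = \sigma_2\sigma_1\sigma_2$ of $B_3$. It is an illustration under our own assumptions, not the paper's loss: the helper names `word_to_matrix` and `relation_residual`, the Frobenius-norm residual, and the signed-integer encoding of words are all hypothetical. The fixed reduced Burau matrices stand in for learned matrices that satisfy the relation exactly, and random matrices stand in for an untrained model.

```python
import numpy as np

def word_to_matrix(word, gens):
    """Multiply generator matrices left to right.

    `word` is a list of nonzero ints: i means generator i, -i its inverse
    (an encoding chosen for this sketch, not taken from the paper).
    """
    dim = next(iter(gens.values())).shape[0]
    M = np.eye(dim)
    for g in word:
        G = gens[abs(g)]
        M = M @ (G if g > 0 else np.linalg.inv(G))
    return M

def relation_residual(lhs, rhs, gens):
    """Frobenius norm of rho(lhs) - rho(rhs); zero iff the relation holds."""
    diff = word_to_matrix(lhs, gens) - word_to_matrix(rhs, gens)
    return float(np.linalg.norm(diff))

# Reduced Burau matrices of B_3 at t = 2: a known representation,
# used here as a stand-in for well-trained generator matrices.
t = 2.0
burau = {
    1: np.array([[-t, 1.0], [0.0, 1.0]]),
    2: np.array([[1.0, 0.0], [t, -t]]),
}
braid_residual = relation_residual([1, 2, 1], [2, 1, 2], burau)

# Random matrices: a stand-in for generator matrices before training.
rng = np.random.default_rng(0)
random_gens = {1: rng.normal(size=(2, 2)), 2: rng.normal(size=(2, 2))}
random_residual = relation_residual([1, 2, 1], [2, 1, 2], random_gens)

print(braid_residual, random_residual)
```

A residual of this kind could be added to a training objective as a penalty. Whether the paper's auxiliary relation loss takes this exact form is not stated in this summary.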

Paper Structure (47 sections, 2 theorems, 11 equations, 5 figures, 5 tables)

Key Result

Proposition 1

Matrix Block defines a representation of the free group.
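
One way to read this statement, in notation that is ours rather than the paper's: write $g_1, \dots, g_n$ for the generators of the free group $F_n$ and $M_1, \dots, M_n \in GL_d(\mathbb{R})$ for the invertible matrices assigned to them. Because $F_n$ has no relations, this assignment extends uniquely to a homomorphism $\rho$, and $\rho$ defines a representation of a quotient group $G = F_n / \langle\langle R \rangle\rangle$ exactly when every relator in $R$ is sent to the identity.

```latex
\rho\bigl(g_{i_1}^{\epsilon_1} \cdots g_{i_k}^{\epsilon_k}\bigr)
  = M_{i_1}^{\epsilon_1} \cdots M_{i_k}^{\epsilon_k},
  \qquad \epsilon_j \in \{+1, -1\},
\qquad
\rho(w\,w') = \rho(w)\,\rho(w') \quad \text{for all } w, w' \in F_n,
\qquad
\rho \text{ descends to } G \iff \rho(r) = I \ \text{ for all } r \in R .
```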

Figures (5)

  • Figure 1: Schematic of MatrixNet for predicting the order of elements of $S_3$. Input generators $\sigma_1$ and $\sigma_2$ are mapped to learned representations and sequentially multiplied to give a matrix representation of the group element $g$. The order is then predicted by the task model, which is an MLP.
  • Figure 2: Length extrapolation results. Left: The plot shows how MSE grows for increasing word lengths ($y$-axis is truncated for clarity). Right: The plot shows how the average accuracy decays for increasing word lengths. The relatively high accuracy of MatrixNet and MatrixNet-MC compared to baselines suggests that the high MSE is caused by outliers with multiplicity predictions much higher than the ground truth.
  • Figure 3: Visualization of learned matrix representations. The first two figures show the representations for the generators of $B_{3}$. The last two figures show the representation for equivalent words that are generated by the relations of $B_{3}$.
  • Figure 4: The Dynkin graph of type $A_n$.
  • Figure 5: The doubled quiver $\Gamma_n^\text{dbl}$ of the Dynkin graph of type $A_n$.
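
The data flow described in the Figure 1 caption can be sketched as follows. This is a minimal illustration, not the MatrixNet implementation. Two things are swapped in and should be read as assumptions: fixed $3 \times 3$ permutation matrices replace the learned generator representations, and an exact order computation replaces the MLP task model. The names `SIGMA`, `represent`, and `order` are hypothetical.

```python
import numpy as np

# Permutation matrices for the transpositions (1 2) and (2 3) in S_3,
# standing in for the learned matrices of sigma_1 and sigma_2.
SIGMA = {
    1: np.array([[0, 1, 0], [1, 0, 0], [0, 0, 1]]),
    2: np.array([[1, 0, 0], [0, 0, 1], [0, 1, 0]]),
}

def represent(word):
    """Map a word (list of generator indices) to a matrix by sequential multiplication."""
    M = np.eye(3, dtype=int)
    for i in word:
        M = M @ SIGMA[i]
    return M

def order(M, max_order=6):
    """Smallest k >= 1 with M^k = I; plays the role of the task model here."""
    identity = np.eye(M.shape[0], dtype=int)
    P = M.copy()
    for k in range(1, max_order + 1):
        if np.array_equal(P, identity):
            return k
        P = P @ M
    raise ValueError("order exceeds max_order")

# Two different words for the same element give the same matrix,
# so any function of the matrix assigns them the same prediction.
same_element = np.array_equal(represent([1, 2, 1]), represent([2, 1, 2]))
print(order(represent([1])), order(represent([1, 2])), same_element)
```

Because the prediction depends only on the product matrix, a longer word that reduces to the same group element yields the same output. This is the mechanism behind the length extrapolation discussed for Figure 2, which holds to the extent that the learned matrices actually satisfy the group relations.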

Theorems & Definitions (5)

  • Proposition 1
  • proof
  • Proposition 2
  • proof
  • Remark A.1