Group-invariant tensor train networks for supervised learning

Brent Sprangers, Nick Vannieuwenhoven

TL;DR

This work tackles incorporating group invariance into tensor-train networks (TTNs) for supervised learning. It introduces a new, efficient algorithm to construct an orthonormal basis of $G$-invariant tensors under normal representations by solving a reduced joint eigenproblem, enabling $G$-invariant TTNs with lower memory and computation. The method yields substantial speedups over prior invariant-basis approaches and is validated on parity classification and transcription-factor binding tasks, demonstrating competitive predictive performance while reducing parameter counts. By exploiting problem-specific symmetries such as reverse-complement invariance in DNA, the approach provides a scalable, symmetry-guided learning framework with practical impact in computational biology and beyond.

Abstract

Invariance has recently proven to be a powerful inductive bias in machine learning models. One such class of predictive or generative models are tensor networks. We introduce a new numerical algorithm to construct a basis of tensors that are invariant under the action of normal matrix representations of an arbitrary discrete group. This method can be up to several orders of magnitude faster than previous approaches. The group-invariant tensors are then combined into a group-invariant tensor train network, which can be used as a supervised machine learning model. We applied this model to a protein binding classification problem, taking into account problem-specific invariances, and obtained prediction accuracy in line with state-of-the-art deep learning approaches.
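
To make "a basis of invariant tensors" concrete, the sketch below computes one by brute force. A tensor is invariant when the Kronecker product of the factor representations fixes its vectorization for every generator, so the invariant tensors form the joint null space of the matrices $K_g - I$. This is a naive baseline for illustration only, not the reduced joint eigenproblem introduced in the paper. The function name `invariant_basis` and the $\mathbb{Z}_2$ swap example are assumptions made here.

```python
import numpy as np
from functools import reduce


def invariant_basis(generator_reps, tol=1e-10):
    """Orthonormal basis of tensors fixed by every generator (naive baseline).

    generator_reps: one entry per group generator; each entry is a list of
    square matrices, one per tensor factor. Tensors are vectorized in
    row-major order, which matches the ordering used by np.kron.
    """
    constraints = []
    for factors in generator_reps:
        K = reduce(np.kron, factors)          # action of the generator on vec(T)
        constraints.append(K - np.eye(K.shape[0]))
    A = np.vstack(constraints)
    # Right singular vectors with (numerically) zero singular value span null(A).
    _, s, vh = np.linalg.svd(A)
    rank = int(np.sum(s > tol))
    return vh[rank:].conj().T                 # columns are orthonormal


# Assumed toy example: Z_2 acting on R^2 by swapping coordinates,
# applied to all three factors of an order-3 tensor.
S = np.array([[0.0, 1.0], [1.0, 0.0]])
B = invariant_basis([[S, S, S]])
print(B.shape)  # (8, 4): four independent invariant tensors
```

The dense matrix `A` grows with the product of all factor dimensions, which is why this approach is only practical for small tensors.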

Paper Structure

This paper contains 16 sections, 5 theorems, 53 equations, 4 figures, 4 tables, 1 algorithm.

Key Result

Proposition 2.3

Let $f : \mathrm{W}_{1}\times\cdots\times\mathrm{W}_k \to \mathrm{W}_{k+1}$ be a multilinear map, $\bm{\mathcal{F}}\in\mathrm{W}_1^*\otimes\cdots\otimes\mathrm{W}_k^*\otimes\mathrm{W}_{k+1}$ the associated tensor, $G=\langle g_1,\ldots,g_s\rangle$ a finitely-generated group [...] where $\rho^*(g) = \rho^{-\top}(g)$ is the dual representation.
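
The role of the dual representation $\rho^*(g) = \rho^{-\top}(g)$ can be checked numerically in the simplest case $k = 1$, where $f$ is a linear map with matrix $F$. It is a standard fact that $F\rho_1(g) = \rho_2(g)F$ holds exactly when the tensor of $F$ in $\mathrm{W}_1^*\otimes\mathrm{W}_2$ is fixed by $\rho_1^*(g)\otimes\rho_2(g)$. The sketch below verifies this for an assumed toy group $\mathbb{Z}_2$ with hypothetical representations `rho1` and `rho2`; it illustrates the definition and is not a restatement of the truncated proposition.

```python
import numpy as np

rng = np.random.default_rng(0)

# Assumed toy setup: G = Z_2 = {e, g}, acting on W_1 = R^2 by a coordinate
# swap and on W_2 = R^2 by a sign flip of the second coordinate.
rho1 = np.array([[0.0, 1.0], [1.0, 0.0]])
rho2 = np.diag([1.0, -1.0])

# Average an arbitrary matrix over the group to obtain an equivariant map F.
F0 = rng.standard_normal((2, 2))
F = 0.5 * (F0 + rho2 @ F0 @ np.linalg.inv(rho1))

# Equivariance of the map: F(rho1 x) = rho2 F(x) for every x.
equivariant = np.allclose(F @ rho1, rho2 @ F)

# The associated tensor lives in W_1^* (x) W_2; store it as T[i, j] = F[j, i].
T = F.T
rho1_dual = np.linalg.inv(rho1).T             # dual representation rho^{-T}
invariant = np.allclose(np.kron(rho1_dual, rho2) @ T.reshape(-1), T.reshape(-1))

print(equivariant, invariant)
```

Both checks should hold together: the map is equivariant and its tensor is invariant.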

Figures (4)

  • Figure 1: Illustration of how tensor networks can be used as machine learning models in graphical notation [Bridgeman_2017]. The grey nodes denote the rank-1 input tensor $\Phi({\bm{x}})$ formed by the tensor product of the local feature maps ${\bm{\phi}}_i(x_i)$. The contraction of $\Phi({\bm{x}})$ with the feature tensor (the rectangles with rounded corners) yields the feature vector (the single output edge at the top of both graphs). In the left figure, the feature tensor is a single order-$6$ tensor ($5$ input spaces and one output space). In the right figure, the feature tensor has a TTN structure and is represented by two matrices (the leftmost and rightmost rectangles), an order-$4$ tensor (middle rectangle), and two order-$3$ tensors.
  • Figure 1: Construction times for group-invariant tensor bases.
  • Figure 2: Parity classification results (both on training and validation set) on $100$ runs after training for $100$ epochs.
  • Figure 3: RC-invariant TTN architecture and building block constraints in Penrose graphical notation [Bridgeman_2017]. The direction of the arrows indicates whether the vector space is a dual space or not. The constraints are invariance constraints as introduced earlier with an extra transposition of the bond indices to account for the reverse operation of the symmetry. To arrive at a model that is RC-invariant, the trivial representation is taken on the output vector space.
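
The contraction described in the first Figure 1 caption can be sketched as follows. Each core is first contracted with its local feature vector, after which the chain is collapsed from both ends towards the core carrying the output edge. The local feature map `feature_map`, the function name `ttn_forward`, and all dimensions are assumptions chosen for illustration. The boundary matrices are stored as order-3 cores with a bond dimension of 1 on the outer side.

```python
import numpy as np


def feature_map(x):
    # Hypothetical local feature map phi(x); the paper's choice may differ.
    return np.array([1.0, x])


def ttn_forward(cores, x, out_site):
    """Contract a tensor train with the rank-1 input Phi(x).

    cores[i] has shape (r_left, d, r_right); cores[out_site] has an extra
    trailing output axis, i.e. shape (r_left, d, r_right, n_out).
    """
    # Contract every physical index with its local feature vector.
    mats = [np.tensordot(core, feature_map(xi), axes=([1], [0]))
            for core, xi in zip(cores, x)]
    left = np.ones(1)
    for m in mats[:out_site]:                 # sweep in from the left
        left = left @ m
    right = np.ones(1)
    for m in reversed(mats[out_site + 1:]):   # sweep in from the right
        right = m @ right
    return np.einsum('a,abo,b->o', left, mats[out_site], right)


rng = np.random.default_rng(0)
d, r, n_out = 2, 3, 2
shapes = [(1, d, r), (r, d, r), (r, d, r, n_out), (r, d, r), (r, d, 1)]
cores = [rng.standard_normal(s) for s in shapes]
x = rng.uniform(size=5)
y = ttn_forward(cores, x, out_site=2)
print(y.shape)  # (2,)
```

The cost of this contraction is linear in the number of sites, whereas contracting with the equivalent single order-$6$ tensor requires storing all $d^5 \cdot n_{\text{out}}$ entries.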

Theorems & Definitions (13)

  • Example 2.1
  • Example 2.2
  • Proposition 2.3
  • Proof 1
  • Remark 2.4
  • Definition 2.5: $G$-invariant tensor
  • Lemma 3.1: Singh, Pfeifer, and Vidal [singh2010tensor]
  • Proposition 3.2
  • Proof 2
  • Lemma 4.1: Davis [davis1979circulant]
  • ...and 3 more