Riemannian Optimization on Tree Tensor Networks with Application in Machine Learning

Marius Willner, Marco Trenti, Dirk Lebiedz

TL;DR

The paper addresses optimization over tree tensor networks (TTNs) by formulating TTN-parameter spaces as quotient manifolds to remove gauge ambiguity. It develops two explicit horizontal spaces (Cartesian and orthogonal) and corresponding projectors, enabling first- and second-order Riemannian optimization (gradient descent and Newton/trust-region) on TTN quotients, along with a backpropagation scheme for kernel learning. The authors provide explicit gradient, Hessian, and retraction formulations tailored to TTN geometry and demonstrate their effectiveness on a handwritten digits classification task, achieving high accuracies and favorable training speed, especially when using the non-orthogonal horizontal space. The work offers a principled, scalable framework for TTN-based machine learning and sets the stage for extensions to more complex tensor networks and stochastic optimization strategies. The proposed approach has practical impact for kernel learning and ML tasks where structured low-rank representations are advantageous, combining differential geometry with efficient recursive TTN computations.
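The gauge ambiguity mentioned above can be seen on a very small example. The sketch below is not the paper's construction: it assumes a hypothetical two-leaf tree whose contracted tensor is the matrix $U_1 B U_2^\top$, and the names `U1`, `U2`, `B`, `A1`, `A2` and `contract` are illustrative. It inserts invertible matrices on the two internal edges and checks that the parameters change while the represented tensor does not, which is the redundancy a quotient construction factors out.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical two-leaf tree: leaves U1 (n1 x r1), U2 (n2 x r2), root B (r1 x r2).
n1, n2, r1, r2 = 5, 6, 2, 3
U1 = rng.standard_normal((n1, r1))
U2 = rng.standard_normal((n2, r2))
B = rng.standard_normal((r1, r2))


def contract(U1, B, U2):
    """Contract the toy tree into the full matrix X = U1 B U2^T."""
    return U1 @ B @ U2.T


def unit_upper_triangular(r):
    """Random upper-triangular matrix with unit diagonal (always invertible)."""
    return np.eye(r) + np.triu(rng.standard_normal((r, r)), k=1)


# Gauge matrices acting on the two internal edges (illustrative choice).
A1 = unit_upper_triangular(r1)
A2 = unit_upper_triangular(r2)

# Transformed parameters: leaves absorb A, the root absorbs the inverses.
U1_g = U1 @ A1
U2_g = U2 @ A2
B_g = np.linalg.inv(A1) @ B @ np.linalg.inv(A2).T

X = contract(U1, B, U2)
X_g = contract(U1_g, B_g, U2_g)

gauge_invariant = np.allclose(X, X_g)        # same represented tensor
params_differ = not np.allclose(U1, U1_g)    # different parameter point
print(gauge_invariant, params_differ)
```

Both parameter tuples sit in the same orbit of the gauge group, so an optimizer working directly on the parameters sees directions along which the cost is constant. Working on the quotient removes those directions.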

Abstract

Tree tensor networks (TTNs) are widely used in low-rank approximation and quantum many-body simulation. In this work, we present a formal analysis of the differential geometry underlying TTNs. Building on this foundation, we develop efficient first- and second-order optimization algorithms that exploit the intrinsic quotient structure of TTNs. Additionally, we devise a backpropagation algorithm for training TTNs in a kernel learning setting. We validate our methods through numerical experiments on a representative machine learning task.

Paper Structure

This paper contains 25 sections, 8 theorems, 99 equations, 6 figures, 1 table, 4 algorithms.

Key Result

Proposition 2.6

The space of orthogonal parameters $\mathcal{T} = \{x \in \mathcal{E}^*: x \text{ is orthogonal}\}$ is an embedded submanifold of the Euclidean space $\mathcal{E}$.
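To illustrate what optimizing over orthogonality-constrained parameters involves, the sketch below runs Riemannian gradient descent on a single matrix with orthonormal columns (a Stiefel manifold). This is a simplified stand-in, not the paper's TTN algorithm: it assumes that "orthogonal" refers to an orthonormal-columns constraint on a matricized node, which this excerpt does not define, and it uses a toy quadratic cost with a QR-based retraction. The function names `project_tangent`, `retract_qr`, and `rgd` are hypothetical.

```python
import numpy as np


def project_tangent(U, G):
    """Project G onto the tangent space of {U : U^T U = I} at U."""
    UtG = U.T @ G
    return G - U @ ((UtG + UtG.T) / 2)


def retract_qr(U, xi):
    """QR-based retraction: map U + xi back to orthonormal columns."""
    Q, R = np.linalg.qr(U + xi)
    signs = np.sign(np.diag(R))
    signs[signs == 0] = 1.0
    return Q * signs  # fix column signs so the result varies continuously


def rgd(S, U0, step, iters):
    """Riemannian gradient descent for f(U) = -0.5 * trace(U^T S U)."""
    U = U0
    for _ in range(iters):
        egrad = -S @ U                      # Euclidean gradient of the toy cost
        rgrad = project_tangent(U, egrad)   # Riemannian gradient
        U = retract_qr(U, -step * rgrad)    # step, then return to the manifold
    return U


def cost(S, U):
    return -0.5 * np.trace(U.T @ S @ U)


rng = np.random.default_rng(0)
n, r = 8, 2
M = rng.standard_normal((n, n))
S = M @ M.T
S = S / np.linalg.norm(S, 2)  # scale so the largest eigenvalue is 1

U0, _ = np.linalg.qr(rng.standard_normal((n, r)))
U = rgd(S, U0, step=0.1, iters=500)

print(cost(S, U0), cost(S, U))
print(np.allclose(U.T @ U, np.eye(r)))
```

Every iterate keeps orthonormal columns because the retraction is applied after each step, so no separate re-orthogonalization pass is needed. The projection and retraction used here are standard textbook choices for this manifold and are not necessarily the ones the paper uses.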

Figures (6)

  • Figure 1: The TTN recursion and TTN tensor equations (labels eq:ttn_recursion, eq:ttn_tensor) in tensor network diagram notation
  • Figure 1: The total space $\mathcal{M}$ and the quotient space $\mathcal{M/G}$. Both $x$ and $y \in \mathcal{G}_x$ map to the same point $[x]$ under the quotient map $\pi$. Vertical spaces run tangent to the orbit $\mathcal{G}_x$. Horizontal spaces along $\mathcal{G}_x$ are compatible with $\mathrm{d} \theta_\mathcal{A}$. $\xi_x^h$ and $\xi_y^h$ are the unique horizontal lifts of $\xi_{[x]}$ to $x$ and $y$, respectively. Adapted from Wikimedia01.
  • Figure 1: forward propagation of a sample and component $\delta \mathbf{B}_{1,2,3,4}$ of the Euclidean gradient
  • Figure 1: 2000 iterations of RGD for diverse approaches and retractions
  • Figure 2: 200 iterations of RTR for diverse approaches and retractions
  • ...and 1 more figure

Theorems & Definitions (24)

  • Definition 2.1
  • Definition 2.2
  • Remark 2.3
  • Definition 2.4
  • Definition 2.5
  • Proposition 2.6
  • Proof 1
  • Proposition 3.1
  • Proof 2
  • Definition 3.2
  • ...and 14 more