Compressing multivariate functions with tree tensor networks

Joseph Tindall, E. Miles Stoudenmire, Ryan Levy

TL;DR

This work extends tensor-network methods for continuous, high-dimensional functions by using tree tensor networks (TTNs) as a flexible alternative to tensor trains. It develops direct TTN constructions for elementary functions and a tree generalization of tensor cross interpolation (TCI) for learning multivariate targets, and shows that, for a range of multi-dimensional functions, more structured TTNs are a significantly more efficient ansatz than tensor trains. The authors apply TTNs to multi-dimensional, nonlinear Fredholm equations and prove a bound on the rank of the solution which, for certain problems, guarantees accuracy that improves exponentially with the size of the tree. Together, these results broaden the applicability of tensor-network techniques to continuum problems and offer scalable, topology-aware tools with open-source software support.

Abstract

Tensor networks are a compressed format for multi-dimensional data. One-dimensional tensor networks -- often referred to as tensor trains (TT) or matrix product states (MPS) -- are increasingly being used as a numerical ansatz for continuum functions by ``quantizing'' the inputs into discrete binary digits. Here we demonstrate the power of more general tree tensor networks for this purpose. We provide direct constructions of a number of elementary functions as generic tree tensor networks and interpolative constructions for more complicated functions via a generalization of the tensor cross interpolation algorithm. For a range of multi-dimensional functions we show how more structured tree tensor networks offer a significantly more efficient ansatz than the commonly used tensor train. We demonstrate an application of our methods to solving multi-dimensional, non-linear Fredholm equations, providing a rigorous bound on the rank of the solution which, in turn, guarantees exponentially scaling accuracy with the size of the tree tensor network for certain problems.
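
To make the ``quantizing'' step concrete, here is a minimal Python sketch, not taken from the paper or its software. It samples a one-dimensional function on the grid $x = 0.x_{1}x_{2}\ldots x_{L}$, reshapes the values into an order-$L$ tensor with one binary index per digit, and counts the numerical rank across each cut of the digit chain, which is the bond dimension an exact tensor train would need there. The function names `quantics_tensor` and `tt_ranks` and the tolerance `tol` are illustrative assumptions.

```python
import numpy as np

def quantics_tensor(f, num_bits):
    """Sample f on x = 0.b1 b2 ... bL (binary) and return an order-L tensor.

    The returned array has shape (2, 2, ..., 2); index k holds bit b_{k+1},
    so the first index is the most significant digit.
    """
    grid = np.arange(2**num_bits) / 2**num_bits  # all L-bit points in [0, 1)
    values = f(grid)
    # C-order reshape puts the most significant bit on the first axis.
    return values.reshape((2,) * num_bits)

def tt_ranks(tensor, tol=1e-10):
    """Numerical rank of every unfolding (b1..bk | b_{k+1}..bL) of the tensor.

    `tol` is a relative cutoff on singular values (an illustrative choice).
    """
    num_bits = tensor.ndim
    ranks = []
    for cut in range(1, num_bits):
        matrix = tensor.reshape(2**cut, 2**(num_bits - cut))
        s = np.linalg.svd(matrix, compute_uv=False)
        ranks.append(int(np.sum(s > tol * s[0])))
    return ranks

if __name__ == "__main__":
    for name, f in [("exp(x)", np.exp), ("x", lambda x: x)]:
        print(name, tt_ranks(quantics_tensor(f, 8)))
```

Because $e^{a+b} = e^{a}e^{b}$, the exponential factorizes across every cut and needs bond dimension one, while $f(x) = x$ splits as $a + b$ and needs bond dimension two. Less structured functions generally need larger ranks, which is where the choice of network topology starts to matter.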

Paper Structure (12 sections, 12 equations, 8 figures, 1 algorithm)

This paper contains 12 sections, 12 equations, 8 figures, 1 algorithm.

Figures (8)

  • Figure 1: a-c) Representations of the two-dimensional function $f(\mathbf{x}) = f(x_{1}, x_{2})$ with the continuous variables $x_{1}, x_{2} \in [0,1)$ encoded as binary strings $x_{1} = 0.x_{1,1}x_{1,2}x_{1,3}x_{1,4}$ and $x_{2} = 0.x_{2,1}x_{2,2}x_{2,3}x_{2,4}$ of length four. a) The actual values of the function over the domain specified by the binary digits can be encoded as a single order-$8$ tensor. b) Quantics tensor train or matrix product state representation of the order-$8$ tensor with a binary digit ordering commonly referred to as 'sequential'. c) An example of a tree tensor network (TTN) representation of the order-$8$ tensor. d) Top: We use the notation $\mathcal{T}_{(i,j)}$ to refer to the local tensor in a tree corresponding to the binary digit $x_{i,j}$, the $j$th most significant binary digit in the decomposition of the $i$th continuous variable $x_{i}$. This tensor has $z_{(i,j)}$ 'virtual' indices (black lines) connecting it to its neighbors in the tree and a single external index (dotted line) corresponding to the binary variable $x_{i,j} \in \{0,1\}$. Bottom: We use $\mathcal{T}_{(i,j)}(x_{i,j})$ to refer to the order-$z_{(i,j)}$ tensor which is a slice of $\mathcal{T}_{(i,j)}$ for the given value of $x_{i,j}$.
  • Figure 2: The interpolative gauge. a) Tree tensor network for a two-dimensional function $f(\mathbf{x}) = f(x_{1}, x_{2})$ with the external indices decomposing the two continuous variables into binary strings of length three. b) The settings on the virtual indices $\alpha_{1}, \alpha_{2}, \alpha_{3}, \hdots$ of a local tensor map to configurations of the binary digits which the given edge connects to that local tensor. An example mapping is given by the tables shown. c) In the interpolative gauge, one of the tensors in the TTN has the property that its elements, for a given configuration of its virtual indices and external index, are equal to the contraction of the whole network for a specific setting of all the binary digits. This information is encoded via the mapping between virtual indices and external ones. The tensor cross interpolation (TCI) algorithm is an active learning algorithm which utilizes this gauge to move through the network, changing the gauge centre and updating neighboring pairs of local tensors in order to minimize the infinity norm of the difference between their values $\mathcal{T}_{(i,j)}(x_{i,j})_{\boldsymbol{\alpha}}$ and the values of a target function $f(\mathbf{x})$ at some set of interpolation points.
  • Figure 3: Comparison of different tree tensor networks, with $L = 16$ vertices, for compressing two different one-dimensional functions $f(x)$ with a binary decomposition $x = 0.x_{1}x_{2}...x_{16}$. Error $\epsilon$ is calculated via Eq. (\ref{Eq:Errors}), sampling the function over $10^{3}$ random grid points. The different labelled trees used are shown at the top and the bits are numbered from most significant to least significant, with the most significant digit circled in red. Two functions are considered. Top plots: the Laguerre polynomial $f(x) = L_{n}(x) = \sum_{k = 0}^{n} \binom{n}{k}\frac{(-1)^{k}}{k!}x^{k}$ with $n = 40$. Bottom plots: the Weierstrass function $f(x) = \sum_{k=1}^{n}\frac{\sin(\pi k^{a} x)}{\pi k^{a}}$ with $a = 3$ and $n = 25$. Left panels: Sketch of the functions considered over $x \in [0,1]$. Inset of d) shows a zoomed-in region of the function with data points corresponding to the values obtained from the Bident tree tensor network with $\chi = 36$. Middle panels: Error versus bond dimension of the tensor networks. Right panels: Error versus memory requirement for storing the tensor networks, assuming $8$ bytes for a floating point number.
  • Figure 4: Comparison of different tree tensor networks for compressing the three-dimensional function $f(\mathbf{r}) = \sum_{j=1}^{n}\cos(j \mathbf{k_{j}} \cdot \mathbf{r})$ with $n= 30$, $\mathbf{r} = (x,y,z)$ and $\mathbf{k_{j}} = (k^{x}_{j}, k^{y}_{j}, k^{z}_{j})$ with $k^{\alpha}_{j} \sim \mathcal{N}(0,1)$. Error $\epsilon$ is calculated via Eq. (\ref{Eq:Errors}), sampling the function over $10^{3}$ random grid points. Different trees used are shown at the top. Bits for $x$, $y$ and $z$ are coloured in green, grey and light blue respectively and numbered sequentially from most significant to least significant. a) Error versus bond dimension. Inset shows a heatmap of the function at $z = \frac{1}{2}$. b) Error versus memory requirement for storing the tensor networks, assuming $8$ bytes for a single floating point number. c) Mutual information matrix encoding the correlations between the binary digits, calculated by sampling the function $10^{4}$ times to build up an approximate reduced density matrix for the given pair of bits. Each dashed block encodes the matrix of correlations between the bits of two given dimensions. d) Absolute error over one-dimensional slices of the function. Bar chart shows memory requirements on the right for the specified TTN with the maximum bond dimension of the network also shown. Inset (black line) shows a slice of the function for $y = 1/2$ and $z = 1/2$, with corresponding results for the coupled binary tree tensor network at $\chi =18$.
  • Figure 5: Comparison of the effectiveness of different tree tensor networks with $L = 16$ bits per dimension for learning the multinormal probability density function $f(\mathbf{r}) \propto \exp\left( - (\mathbf{r} - \mathbf{\mu})^{T} M^{-1} (\mathbf{r} - \mathbf{\mu})\right)$ via the tensor cross interpolation algorithm. Here $M$ is an $n \times n$ covariance matrix and $\mathbf{\mu} = (\mu_{1}, \mu_{2}, ..., \mu_{n})$ is the mean vector. We consider $n =3$ with $\mathbf{r} = (x,y,z) \in [0,10)^{3}$ and $\mathbf{\mu} = (5,5,5)$. Results are obtained from drawing $N = 10$ instances of $M$ ($M_{1}, M_{2}, \hdots M_{10}$) from the LKJ distribution [Lewandoski2009] with shape parameter $\eta = 50$. Different trees used are shown at the top. Bits for $x$, $y$ and $z$ are coloured in green, grey and light blue respectively. The most significant digit in each dimension is circled in red. a-b) Error $\epsilon$, calculated via Eq. (\ref{Eq:Errors}) after $n = 10$ sweeps of the TCI algorithm, versus bond dimension and memory cost for the tensor networks. The solid lines show the mode of the error over the $10$ realizations of $M$, whilst the shaded area shows the range of the error, i.e. for every sampled $M$ the error $\epsilon$ at a given bond dimension lies within this area. Inset shows a heatmap of the function over $(x,z) \in [3,7]^{2}$ with $y = \frac{1}{2}$. c) Average value for the infinity norm $\epsilon_{\infty}$ (see Eq. (\ref{Eq:Errors})) over a given sweep of the TCI algorithm for $M = M_{1}$ and the three tensor networks at the specified bond dimensions and memory cost.
  • ...and 3 more figures
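
The memory axes in the captions above assume $8$ bytes per floating point number. As a rough illustration of how such a cost depends on the tree topology, the sketch below counts the storage for a network in which every vertex carries one binary external index and $z$ virtual indices, as in Figure 1d. It assumes a single uniform bond dimension $\chi$ on every edge, which is a simplification and only an upper bound, since bonds near the leaves of a real network can be smaller. The function name `ttn_memory_bytes` and the two example trees are hypothetical and are not the trees used in the paper.

```python
def ttn_memory_bytes(edges, chi, bytes_per_float=8, phys_dim=2):
    """Bytes needed to store a tree tensor network with uniform bond dimension.

    edges: list of (u, v) vertex pairs defining the tree.
    Each vertex holds a tensor with one external index of size phys_dim and
    one virtual index of size chi per incident edge, i.e. phys_dim * chi**z
    numbers for a vertex of degree z (assumption: chi is the same everywhere).
    """
    degree = {}
    for u, v in edges:
        degree[u] = degree.get(u, 0) + 1
        degree[v] = degree.get(v, 0) + 1
    return sum(phys_dim * chi**z * bytes_per_float for z in degree.values())

# Two illustrative 15-vertex trees: a chain (tensor train) and a binary tree
# with heap-style labelling, where vertex i is attached to vertex (i - 1) // 2.
chain_edges = [(i, i + 1) for i in range(14)]
binary_tree_edges = [(i, (i - 1) // 2) for i in range(1, 15)]

if __name__ == "__main__":
    for chi in (2, 4, 8):
        print(chi,
              ttn_memory_bytes(chain_edges, chi),
              ttn_memory_bytes(binary_tree_edges, chi))
```

At equal $\chi$ a tree with degree-three vertices costs more than a chain, because each such tensor holds $2\chi^{3}$ numbers rather than $2\chi^{2}$. This is why the figures plot error against memory as well as against bond dimension: a tree is only the better ansatz if it reaches a given error at a sufficiently smaller $\chi$.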