Approximation and learning with compositional tensor trains
Martin Eigel, Charles Miranda, Anthony Nouy, David Sommer
TL;DR
The paper introduces compositional tensor trains (CTTs), which compose low-rank tensor-train (TT) layers to combine expressivity and efficiency in high-dimensional function approximation. It formalizes the CTT framework, demonstrates universal approximation capabilities under mild assumptions on the bases, and provides compression guarantees for layer-wise representations. Two optimization strategies are developed: an approach based on the Pontryagin maximum principle (via the method of successive approximations) and a layerwise natural-gradient method, both of which exploit the tensor structure and use regularization for stability. Numerical experiments validate the approach on regression tasks, illustrate the benefits of low-rank implementations, and explore the role of random sketching in handling ill-conditioned Gram matrices. Overall, the work proposes a scalable, expressive alternative to standard deep neural networks that combines compositional models with tensor-algebraic computation.
Abstract
We introduce compositional tensor trains (CTTs) for the approximation of multivariate functions, a class of models obtained by composing low-rank functions in the tensor-train format. This format can encode standard approximation tools, such as (sparse) polynomials, deep neural networks (DNNs) with fixed width, or tensor networks with arbitrary permutations of the inputs or more general affine coordinate transformations, with similar complexities. This format can be viewed as a DNN with width exponential in the input dimension and structured weight matrices. Compared to DNNs, this format enables controlled compression at the layer level using efficient tensor algebra. On the optimization side, we derive a layerwise algorithm inspired by natural gradient descent, which makes it possible to exploit efficient low-rank tensor algebra. This relies on low-rank estimates of Gram matrices and on tensor-structured random sketching. Viewing the format as a discrete dynamical system, we also derive an optimization algorithm inspired by numerical methods in optimal control. Numerical experiments on regression tasks demonstrate the expressivity of the new format and the relevance of the proposed optimization algorithms. Overall, CTTs combine the expressivity of compositional models with the algorithmic efficiency of tensor algebra, offering a scalable alternative to standard deep neural networks.
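To make the idea of composing low-rank functions in the tensor-train format concrete, the following is a minimal NumPy sketch, not the authors' implementation. It rests on several assumptions that do not come from the abstract: a monomial feature basis, a vector-valued layer whose last TT rank serves as the output dimension, and plain composition x_{l+1} = f_l(x_l) as the discrete dynamical system. The names `tt_eval`, `ctt_eval`, and `swap_layer` are hypothetical. The example layer encodes a permutation of two inputs, one of the cases the abstract says the format can represent.

```python
import numpy as np


def monomial_basis(x, n):
    """Evaluate 1, x, ..., x^(n-1) at the points x (shape (N,)); returns (N, n).

    The monomial basis is an assumption made for this sketch.
    """
    return np.vander(x, n, increasing=True)


def tt_eval(cores, X):
    """Evaluate one vector-valued functional tensor-train layer at points X.

    cores[k] has shape (r_k, n_k, r_{k+1}) with r_0 = 1. Treating the last
    rank r_d as the output dimension is a convention chosen for this sketch.
    X has shape (N, d); the result has shape (N, r_d).
    """
    n_points = X.shape[0]
    left = np.ones((n_points, 1))  # running left interface, shape (N, r_k)
    for k, core in enumerate(cores):
        phi = monomial_basis(X[:, k], core.shape[1])  # features of x_k, (N, n_k)
        # Contract the interface and the features with the k-th core.
        left = np.einsum("na,nj,ajb->nb", left, phi, core)
    return left


def ctt_eval(layers, X):
    """Compose layers: x_{l+1} = f_l(x_l), a discrete dynamical system.

    Each intermediate layer must map R^d to R^d so that it can feed the next.
    """
    for cores in layers:
        X = tt_eval(cores, X)
    return X


def swap_layer():
    """Rank-2 TT layer on R^2 representing f(x1, x2) = (x2, x1)."""
    core1 = np.zeros((1, 2, 2))
    core1[0, 0, 0] = 1.0  # channel 0 carries the constant 1
    core1[0, 1, 1] = 1.0  # channel 1 carries x1
    core2 = np.zeros((2, 2, 2))
    core2[0, 1, 0] = 1.0  # output 0 = 1 * x2
    core2[1, 0, 1] = 1.0  # output 1 = x1 * 1
    return [core1, core2]


X = np.array([[0.5, -2.0], [1.0, 3.0]])
swapped = tt_eval(swap_layer(), X)                    # columns of X exchanged
round_trip = ctt_eval([swap_layer(), swap_layer()], X)  # two swaps compose to the identity
```

Each layer here stores only its small cores, so the cost of an evaluation scales with the ranks and basis sizes rather than with the full tensor of coefficients; this is the kind of structure that layer-level compression would act on.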
